Decomposition Methodology for Classification Tasks - A Meta-Decomposer Framework
Lior Rokach
Department of Information Systems Engineering, Ben-Gurion University of the Negev, P.O.B. 653, Beer-Sheva 84105, Israel. liorrk@bgu.ac.il
Abstract The idea of decomposition methodology for classification tasks is to break down a complex classification task into several simpler and more manageable sub-tasks that are solvable by using existing induction methods, then joining their solutions together in order to solve the original problem. In this paper we provide an overview of very popular but diverse decomposition methods and introduce a related taxonomy to categorize them. Subsequently we suggest using this taxonomy to create a novel meta-decomposer framework to automatically select the appropriate decomposition method for a given problem. The experimental study validates the effectiveness of the proposed meta-decomposer on a set of benchmark datasets.

1 Introduction
One of the explicit challenges in classification tasks is to develop methods that will be feasible for complicated real-world problems. In many disciplines, when a problem becomes more complex, there is a natural tendency to try to break it down into smaller, distinct but connected pieces. The concept of breaking down a system into smaller pieces is generally referred to as decomposition. The purpose of decomposition methodology is to break down a complex problem into smaller, less complex and more manageable sub-problems that are solvable by using existing tools, then joining them together to solve the initial problem. Decomposition methodology can be considered as an effective strategy for changing the representation of a classification problem. Indeed, Kusiak [38] considers decomposition as the “most useful form of transformation of data sets”.
The decomposition approach is frequently used in statistics, operations research and engineering. For instance, decomposition of time series is considered to be a practical way to improve forecasting. The usual decomposition into trend, cycle, seasonal and irregular components was motivated mainly by business analysts, who wanted to get a clearer picture of the state of the economy [18]. Although the operations research community has extensively studied decomposition methods to improve computational efficiency and robustness, identification of the partitioned problem model has largely remained an ad hoc task [26].
In engineering design, problem decomposition has received considerable attention as a means of reducing multidisciplinary design cycle time and of streamlining the design process by adequate arrangement of the tasks [37]. Decomposition methods are also used in decision-making theory; a typical example is the AHP (Analytic Hierarchy Process) method [62]. In artificial intelligence, finding a good decomposition is a major tactic, both for ensuring the transparency of the end-product and for avoiding a combinatorial explosion [45].
Research has shown that no single learning approach is clearly superior for all cases. In fact, the task of discovering regularities can be made easier and less time consuming by decomposition of the task. However, decomposition methodology has not attracted as much attention in the pattern recognition and machine learning community [11].
Although decomposition is a promising technique and presents an obviously natural direction to follow, there are hardly any works in the pattern recognition literature that consider the subject directly. Instead, there are abundant practical attempts to apply decomposition methodology to specific, real-life applications [11]. There are also many discussions on closely related problems, largely in the context of distributed and parallel learning [71] or ensemble classifiers.
Various decomposition methods have been presented [38]. There was also a suggestion to decompose the exploratory data analysis process into three parts: model search, pattern search, and attribute search [7]. However, in this case the notion of “decomposition” refers to the entire process, while this paper focuses on decomposition of the model search.
In the neural network community, several researchers have examined the decomposition methodology [25]. The “mixture-of-experts” (ME) method decomposes the input space, such that each expert examines a different part of the space [46]. However, the sub-spaces have soft “boundaries”, namely sub-spaces are allowed to overlap. Each expert outputs the conditional probability of the target attribute given the input instance. A gating network is responsible for combining the various experts by assigning a weight to each network. These weights are not constant but are functions of the input instance x.
Hierarchical mixtures of experts (HME) is a well-known extension to the basic mixture of experts [32]. This extension decomposes the space into sub-spaces, and then recursively decomposes each sub-space into sub-spaces. Variations of the basic mixture of experts method have been developed to accommodate specific domain problems. A specialized modular network called the Meta-pi network has been used to solve the vowel-speaker problem [24,48]. There have been other extensions to the ME, such as nonlinear gated experts for time-series [70]; a revised modular network for predicting the survival of AIDS patients [47]; and a new approach for combining multiple experts for improving handwritten numerals recognition [53].
Several taxonomies for decomposition methods have been suggested in the literature [38,66]. However, there is no work that considers the coexistence of these different decomposition methods in order to answer practical questions such as: When should we prefer one decomposition method over the other? Is it possible to solve a given problem using a hybridization of several decomposition methods?
In our previous work [57,43], we presented a preliminary taxonomy for decomposition of classification tasks. In this paper, we first extend this taxonomy and subsequently suggest a systematic way such a taxonomy can be used in practice. More specifically, the taxonomy is used as the basis for a new decision-tree-based meta-decomposer which aims to automatically identify the best decomposition method for a given database.

2 Decomposition Advantages
Decomposition methods can improve the predictive accuracy of regular methods. In fact, in some cases improving performance is the main motivation for decomposition [66]. Although this might look surprising at first, it can be explained by the bias-variance tradeoff. Since decomposition methodology constructs several simpler sub-models instead of a single complicated model, we might gain better performance by choosing the appropriate sub-models’ complexities (i.e. finding the best bias-variance tradeoff). For instance, a single decision tree that attempts to model the entire instance space usually has high variance and small bias. On the other hand, Naïve Bayes can be seen as a composite of single-attribute decision trees (each one of these trees contains only one unique input attribute). The bias of Naïve Bayes is large (as it cannot represent a complicated classifier); on the other hand, its variance is small. Decomposition can potentially obtain a set of decision trees, such that each one of the trees is more complicated than a single-attribute tree (thus it can represent a more complicated classifier and it has lower bias than Naïve Bayes) but not complicated enough to have high variance.
There are other justifications for the performance improvement of decomposition methods, such as the ability to exploit the specialized capabilities of each component, and consequently achieve results which would not be possible in a single model. For instance, the identification accuracy of a clinical diagnosis can be improved when the problem is decomposed and two neural networks are trained [3].
One of the explicit challenges of the research community is to develop methods that facilitate the use of pattern recognition algorithms for real-world applications. In the information age, data is automatically collected and therefore the database available for pattern discovery can be quite large, as a result of an increase in the number of records in the database and the number of fields/attributes in each record (high dimensionality). There are many approaches for dealing with huge databases, including: sampling methods; massively parallel processing; efficient storage methods; and dimension reduction. Decomposition methodology suggests an alternative way to deal with the aforementioned problems by reducing the volume of data to be processed at a time. Decomposition methods break the original problem into several sub-problems, each one with relatively small dimensionality. In this way, decomposition reduces training time and makes it possible to apply standard pattern recognition algorithms to large databases [66].

Decomposition methods suggest a conceptual simplification of the original complex problem. Instead of getting a single and complicated model, decomposition methods create several sub-models, which are more comprehensible. This motivation has often been noted in the literature [49,28,66]. Smaller models are also more appropriate for user-driven data mining that is based on visualization techniques. Furthermore, if the decomposition structure is induced by automatic means, it can provide new insights about the explored domain.
Modularity eases the maintenance of the classification model. Since new data is being collected all the time, it is essential once in a while to execute a rebuild process for the entire model. However, if the model is built from several sub-models, and the new data collected affects only part of the sub-models, a simpler re-building process may be sufficient. This justification has often been noted [38].
If there are no dependencies between the various sub-components, then parallel techniques can be applied. By using parallel computation, the time needed to solve a mining problem can be shortened.
Decomposition methodology suggests the ability to use different inducers for individual sub-problems, or even to use the same inducer but with a different setup. For instance, it is possible to use neural networks having different topologies (different numbers of hidden nodes). The researcher can exploit this freedom of choice to boost classifier performance.
The first three advantages are of particular importance in commercial and industrial data mining. However, as will be demonstrated later, not all decomposition methods display the same advantages.
3 The Elementary Decomposition Taxonomy
Finding an optimal or quasi-optimal decomposition for a certain supervised learning problem might be hard or impossible. For that reason the elementary decomposition methodology has been proposed [43]. The basic idea is to develop a meta-algorithm that recursively decomposes a classification problem using elementary decomposition methods. We use the term “elementary decomposition” to describe a type of simple decomposition that can be used to build up a more complicated decomposition. Given a certain problem, we first select the most appropriate elementary decomposition for that problem. A suitable decomposer then decomposes the problem, and finally a similar procedure is performed on each sub-problem. This approach agrees with the “no free lunch theorem”, namely if one decomposition is better than another in some domains, then there are necessarily other domains in which this relationship is reversed.
For implementing this decomposition methodology, one might consider the following issues:
– What types of elementary decomposition methods exist for classification inducers?
– Which elementary decomposition type performs best for which problem? What factors should one take into account when choosing the appropriate decomposition type?
– Given an elementary type, how should we infer the best decomposition structure automatically?
– How should the sub-problems be re-composed to represent the original concept learning?
– How can we utilize prior knowledge for improving decomposition methodology?

Figure 1 suggests an answer to the first issue. This figure illustrates a novel approach for arranging the different elementary types of decomposition in supervised learning.
Fig. 1 Elementary Decomposition Methods in Classification. The taxonomy splits supervised learning decomposition into original concept decomposition (tuple decomposition, subdivided into sample and space decomposition, and attribute decomposition) and intermediate concept decomposition (concept aggregation and function decomposition).
3.1 Intermediate Concept Decomposition
In intermediate concept decomposition, instead of inducing a single complicated classifier, several sub-problems with different and simpler concepts are defined. The intermediate concepts can be based on an aggregation of the original concept’s values (concept aggregation) or not (function decomposition).
Classical concept aggregation replaces the original target attribute with a function, such that the domain of the new target attribute is smaller than the original one.
Concept aggregation has been used to classify free text documents into predefined topics [11]. This paper suggests breaking the topics up into groups (co-topics). Instead of predicting the document’s topic directly, the document is first classified into one of the co-topics. Another model is then used to predict the actual topic within that co-topic.
Error-Correcting Output Coding (ECOC) is a general concept aggregation algorithm which decomposes multi-class problems into multiple two-class problems [15]. A classifier is built for each possible binary partition of the classes. Experiments show that ECOC improves the accuracy of neural networks and decision trees on several multi-class problems from the UCI repository. The idea of decomposing a K-class classification problem into K two-class classification problems was proposed in [2]; each problem considers the discrimination of one class from the other classes. The last method can be extended by manipulating the data based on the class relations among training data [41]. Using this method, a K-class classification problem is divided into a series of K(K−1)/2 two-class problems, where each problem considers the discrimination of one class from one of the other classes. They have examined this idea using neural networks.
Round-robin classification (pairwise classification) is a technique for handling multi-class problems, in which one classifier is constructed for each pair of classes [20]. Empirical studies have shown that this method can potentially improve classification accuracy.
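The two class-decomposition schemes just described can be sketched in a few lines. The following Python fragment is an illustrative sketch (ours, not from the cited works): it enumerates the K one-vs-rest binary sub-tasks and the K(K−1)/2 pairwise sub-tasks, where each pairwise task keeps only the tuples of its two classes.

```python
from itertools import combinations

def one_vs_rest_tasks(data, classes):
    """One binary sub-task per class: class k versus all other classes."""
    tasks = {}
    for k in classes:
        tasks[k] = [(x, 1 if y == k else 0) for x, y in data]
    return tasks

def pairwise_tasks(data, classes):
    """One binary sub-task per unordered class pair (K(K-1)/2 tasks),
    each trained only on the tuples belonging to that pair of classes."""
    tasks = {}
    for a, b in combinations(classes, 2):
        tasks[(a, b)] = [(x, y) for x, y in data if y in (a, b)]
    return tasks

data = [([0], 'A'), ([1], 'B'), ([2], 'C'), ([3], 'A')]
classes = ['A', 'B', 'C']
print(len(one_vs_rest_tasks(data, classes)))   # K = 3 tasks
print(len(pairwise_tasks(data, classes)))      # K(K-1)/2 = 3 tasks
```

Note that for K = 3 both schemes happen to produce three sub-tasks; for larger K the pairwise scheme produces quadratically more, but each task sees far less data.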
Function decomposition was originally developed in the fifties and sixties for designing switching circuits. It was even used as an evaluation mechanism for checker-playing programs [63]. This approach was later improved [8]. Recently, the machine learning community has adopted this approach. A manual decomposition of the problem and an expert-assisted selection of examples to construct rules for the concepts in the hierarchy was studied in [45]. In comparison with standard decision tree induction techniques, structured induction exhibits about the same degree of classification accuracy with increased transparency and lower complexity of the developed models.
A general-purpose function decomposition approach for machine learning has also been developed [73]. According to this approach, attributes are transformed into new concepts in an iterative manner, creating a hierarchy of concepts. It is also possible to use a different function decomposition known as bi-decomposition [40].
3.2 Original Concept Decomposition
Original concept decomposition means dividing the original problem into several sub-problems by partitioning the training set into smaller training sets. A classifier is trained on each sub-sample, seeking to solve the original problem. Note that this resembles ensemble methodology, but with the following distinction: each inducer uses only a portion of the original training set and ignores the rest. After a classifier is constructed for each portion separately, the models are combined in some fashion, either at learning or classification time.
There are two obvious ways to break up the original dataset: tuple-oriented or attribute (feature) oriented. Tuple decomposition by itself can be divided into two different types: sample and space. In sample decomposition (also known as partitioning), the goal is to partition the training set into several sample sets, such that each sub-learning task considers the entire space.
In space decomposition, on the other hand, the original instance space is divided into several sub-spaces. Each sub-space is considered independently and the total model is a (possibly soft) union of such simpler models.
Space decomposition also includes the divide-and-conquer approaches such as mixtures of experts, local linear regression, CART/MARS, adaptive subspace models, etc. [31,32,54,27].
Feature set decomposition (also known as attribute set decomposition) generalizes the task of feature selection, which is extensively used in data mining. Feature selection aims to provide a representative set of features from which a classifier is constructed. In feature set decomposition, on the other hand, the original feature set is decomposed into several subsets. An inducer is trained upon the training data for each subset independently, and generates a classifier for each one. Subsequently, an unlabelled instance is classified by combining the classifications of all classifiers. This method potentially facilitates the creation of a classifier for high dimensionality datasets, because each sub-classifier copes with only a projection of the original space.
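The train-per-subset-then-combine scheme described above can be sketched as follows. This is a minimal illustration (ours), with a 1-nearest-neighbour classifier standing in for an arbitrary inducer and plain majority voting as the combiner:

```python
def project(x, subset):
    """Keep only the feature values whose indices are in the subset."""
    return tuple(x[i] for i in subset)

class SubsetNN:
    """A 1-nearest-neighbour classifier trained on one feature subset only
    (a stand-in for any inducer)."""
    def __init__(self, subset, data):
        self.subset = subset
        self.data = [(project(x, subset), y) for x, y in data]

    def predict(self, x):
        p = project(x, self.subset)
        return min(self.data,
                   key=lambda d: sum((a - b) ** 2 for a, b in zip(d[0], p)))[1]

def classify_by_decomposition(data, subsets, x):
    """Train one sub-classifier per feature subset; combine by majority vote."""
    votes = [SubsetNN(s, data).predict(x) for s in subsets]
    return max(set(votes), key=votes.count)

data = [((0, 0, 5, 1), 'a'), ((1, 1, 9, 8), 'b'), ((0, 1, 4, 2), 'a')]
print(classify_by_decomposition(data, [(0, 1), (2, 3)], (0, 0, 4, 1)))  # prints "a"
```

Each sub-classifier sees only a projection of the instance space, which is exactly what makes the approach attractive for high-dimensional data.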
In the literature there are several works that fit the feature set decomposition framework. However, in most of the papers the decomposition structure was obtained ad hoc using prior knowledge. Moreover, it was argued that: “There exists no algorithm or method susceptible to perform a vertical self-decomposition without a-priori knowledge of the task!” [61].
The feature set decomposition algorithm known as MFS (Multiple Feature Subsets) combines multiple nearest neighbor classifiers, each using only a subset of random features [4]. Experiments show MFS can improve the standard nearest neighbor classifier. This procedure resembles the well-known bagging algorithm [10]. However, instead of sampling instances with replacement, it samples features without replacement.
An additional alternative is to group the features according to the attribute type: nominal value features, numeric value features and text value features [38]. A similar approach was used for developing the linear-Bayes classifier [21]. The basic idea consists of aggregating the features into two subsets: the first subset containing only the nominal features and the second subset only the continuous features.
An approach for constructing an ensemble of classifiers using rough set theory was presented by Hu [29]. Although Hu’s work refers to ensemble methodology and not decomposition methodology, it is still relevant in this case, especially as the declared goal was to construct an ensemble such that different classifiers use different attributes as much as possible. According to Hu, diversified classifiers lead to uncorrelated errors, which in turn improve classification accuracy. The method searches for a set of reducts, which include all the indispensable attributes. A reduct represents the minimal set of attributes which has the same classification power as the entire attribute set.
The feature set can be decomposed according to the target class [68]. For each class, the features with low correlation relating to that class have been removed. This method has been applied on a feature set of 25 sonar signals where the target was to identify the meaning of the sound (whale, cracking ice, etc.).
Feature set decomposition has been used for radar volcano recognition [14]. In this case, a feature set of 119 features was manually decomposed into 8 subsets. Features that are based on different image processing operations were grouped together. As a consequence, for each subset, four neural networks with different sizes were built.
Feature set decomposition has proved beneficial in many other applications, such as text-independent speaker identification [13], the truck backer-upper problem [30] and quality problems in manufacturing plants [42].
The co-training paradigm for learning with labelled and unlabelled data can be considered as a feature set decomposition for classifying Web pages [9]. Co-training is useful when there is a large data sample, of which only a small part is labelled. In many applications, unlabelled examples are significantly easier to collect than labelled ones. This is especially true when the labelling process is time-consuming or expensive, such as in medical applications. According to the co-training paradigm, the input space is divided into two different views (i.e. two independent and redundant sets of features). For each view, a different classifier is built to classify unlabelled data. The newly labelled data of each classifier is then used to retrain the other classifier. It has been shown, both empirically and theoretically, that unlabelled data can be used to augment labelled data.
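The co-training loop just described can be sketched as follows. This is a toy illustration (ours, not from [9]): each view is a single feature, a trivial frequency learner stands in for a real classifier, and the confidence-based selection of newly labelled examples used by the original paradigm is omitted.

```python
def train(view_data):
    """Toy per-view learner: map each seen feature value to its majority label
    (a stand-in for a real classifier such as Naive Bayes)."""
    table = {}
    for v, y in view_data:
        table.setdefault(v, []).append(y)
    return {v: max(set(ys), key=ys.count) for v, ys in table.items()}

def co_train(labelled, unlabelled, rounds=2):
    """Each labelled instance is ((view1, view2), label)."""
    for _ in range(rounds):
        m1 = train([(v1, y) for (v1, _), y in labelled])
        m2 = train([(v2, y) for (_, v2), y in labelled])
        still_unlabelled = []
        for v1, v2 in unlabelled:
            if v1 in m1:              # classifier 1 labels the instance ...
                labelled.append(((v1, v2), m1[v1]))
            elif v2 in m2:            # ... or classifier 2 does
                labelled.append(((v1, v2), m2[v2]))
            else:
                still_unlabelled.append((v1, v2))
        unlabelled = still_unlabelled
    return labelled
```

An instance whose first view is recognized gets labelled by the first classifier, and that new label then helps the second classifier label instances it could not handle alone, which is the essence of the paradigm.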
Another alternative for decomposing the feature set is as follows [39]: all input features are initially grouped by using a hierarchical clustering algorithm based on pairwise mutual information, with statistically similar features assigned to the same group. As a consequence, several feature subsets are constructed by selecting one feature from each group. A neural network is subsequently constructed for each subset. All networks are then combined.
In the statistics literature, the most well-known decomposition algorithm is the MARS algorithm [19]. In this algorithm, a multiple regression function is approximated using linear splines and their tensor products. It has been shown that the algorithm performs an ANOVA decomposition, namely the regression function is represented as a grand total of several sums. The first sum is of all basis functions that involve only a single attribute. The second sum is of all basis functions that involve exactly two attributes, representing (if present) two-variable interactions. Similarly, the third sum represents (if present) the contributions from three-variable interactions, and so on.
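The ANOVA decomposition just described can be written compactly. The following formula uses the standard MARS notation (ours, not reproduced from the paper):

```latex
\hat{f}(x) \;=\; a_0
\;+\; \sum_{i} f_i(x_i)
\;+\; \sum_{i<j} f_{ij}(x_i, x_j)
\;+\; \sum_{i<j<k} f_{ijk}(x_i, x_j, x_k)
\;+\; \cdots
```

Here $a_0$ is the constant term, the first sum collects all basis functions involving a single attribute $x_i$, the second collects those involving exactly two attributes (two-variable interactions), and so on.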
Other works on feature set decomposition have been developed by extending the Naïve Bayes classifier. The Naïve Bayes classifier [16] uses Bayes’ rule to compute the conditional probability of each possible class, assuming the input features are conditionally independent given the target feature. Due to the conditional independence assumption, this method is called “Naïve”. Nevertheless, a variety of empirical studies show, surprisingly, that the Naïve Bayes classifier can perform quite well compared to other methods, even in domains where clear feature dependencies exist [16]. Furthermore, Naïve Bayes classifiers are also very simple and easy to understand [36].
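Under the conditional independence assumption described above, the Naïve Bayes classification rule takes the standard form (notation ours):

```latex
\hat{y} \;=\; \arg\max_{y} \; P(y) \prod_{i=1}^{n} P(x_i \mid y)
```

where $x_1, \dots, x_n$ are the input features and $y$ ranges over the possible classes; the product over individual features is exactly what makes the classifier a composite of single-attribute models.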
Recently, a new general framework that searches for helpful feature set decomposition structures has been proposed [58]. This framework nests many algorithms, two of which are tested empirically over a set of benchmark datasets. The first algorithm performs a serial search while using a new Vapnik-Chervonenkis dimension bound for multiple oblivious trees as an evaluating schema. The second algorithm performs a multi-search while using a wrapper evaluating schema. This work indicates that feature set decomposition can increase the accuracy of decision trees.
4 The Decomposer’s Characteristics
The following sub-sections present the main properties that characterize decomposers. These properties can be useful for differentiating between various decomposition frameworks.
4.1 The Structure Acquiring Method
This important property indicates how the decomposition structure is obtained:
– Manually (explicitly) based on an expert’s knowledge in a specific domain [9,45]. If the origin of the dataset is a relational database, then the schema’s structure may imply the decomposition structure.
– Predefined due to some restrictions (as in the case of distributed data mining).
– Arbitrarily [17,12] – the decomposition is performed without any profound thought. Usually, after setting the size of the subsets, members are randomly assigned to the different subsets.
– Induced without human interaction by a suitable algorithm [73].

Some may justifiably claim that searching for the best decomposition might be time-consuming, namely prolonging the induction process. In order to avoid this disadvantage, the complexity of the decomposition algorithms should be kept as small as possible. However, even if this cannot be accomplished, there are still important advantages, such as better comprehensibility and better performance, that make decomposition worth the additional computational complexity.
Furthermore, it should be noted that in an ongoing induction effort (as in a churning application), searching for the best decomposition structure might be performed in wider time buckets (for instance, once a year) than training the classifiers (for instance, once a week). Moreover, for acquiring the decomposition structure, only a relatively small sample of the training set may be required. Consequently, the execution time of the decomposer will be relatively small compared to the time needed to train the classifiers. Usually in real-life applications the decomposition is performed manually by incorporating business information into the modelling process. The following quotation provides a practical example [6]:

It may be known that platinum cardholders behave differently from gold cardholders. Instead of having a data mining technique figure this out, give it the hint by building separate models for the platinum and gold cardholders.
Decomposition can also be useful for handling missing data. In this case we do not refer to sporadic missing data but to the case where several attribute values are available for some tuples but not for all of them. For instance: “Historical data, such as billing information, is available only for customers who have been around for a sufficiently long time” or “Outside data, such as demographics, is available only for the subset of the customer base that matches”. In this case, one classifier can be trained for customers having all the information and a second classifier for the remaining customers [6].
4.2 The Mutually Exclusive Property
This property indicates whether the decomposition is mutually exclusive (disjointed decomposition) or partially overlapping (i.e. a certain value of a certain attribute in a certain tuple is utilized more than once). For instance, in the case of sample decomposition, “mutually exclusive” means that a certain tuple cannot belong to more than one subset [17,12]. Others have used non-exclusive feature decomposition [4]. Similarly, CART and MARS perform mutually exclusive decomposition of the input space, while HME allows sub-spaces to overlap.
Partially overlapping decompositions are potentially more accurate than mutually exclusive decompositions, because the latter form a restriction on the problem space which might skip over accurate models. Still, mutually exclusive decomposition has some important and helpful properties:

– A greater tendency to reduce execution time than non-exclusive approaches. Since most learning algorithms have computational complexity that is greater than linear in the number of attributes or tuples, partitioning the problem dimensionality in a mutually exclusive manner means a decrease in computational complexity [51].
– Since mutual exclusiveness entails using smaller datasets, the models obtained for each sub-problem are smaller in size. Without the mutually exclusive restriction, each model can be as complicated as the model obtained for the original problem. Smaller models contribute to comprehensibility and ease of maintaining the solution.
– Mutually exclusive decomposition may help avoid some error correlation problems that characterize non-mutually exclusive decompositions [4]. However, mutually exclusive training sets do not necessarily result in low error correlation [66]. This point is true when each sub-problem is representative (i.e. represents the entire problem, as in sample decomposition).
– Reduced tendency to contradiction between sub-models. When a mutually exclusive restriction is unenforced, different models might generate contradictory classifications using the same input. Reducing inter-model contradictions helps us to grasp the results and to combine the sub-models into one model. The resulting predictions of ensemble methods are usually inscrutable to end-users, mainly due to the complexity of the generated models, as well as the obstacles in transforming these models into a single model [56]. Moreover, since these methods do not attempt to use all relevant features, the researcher will not obtain a complete picture of which attributes actually affect the target attribute, especially when, in some cases, there are many relevant attributes.
– Since the mutually exclusive approach encourages smaller datasets, it is more feasible. Some inducers can process only a limited dataset size (for instance, when the program requires that the entire dataset be stored in main memory). The mutually exclusive approach can ensure that inducers are fairly scalable to large datasets [12,51].
– We claim that end-users can grasp mutually exclusive decomposition much more easily than many other methods currently in use. For instance, boosting, which is a well-known ensemble method, distorts the original distribution of the instance space, a fact that non-professional users find hard to grasp or understand.
4.3 The Inducer Usage
This property indicates the relation between the decomposer and the inducer used. Some decomposition implementations are “inducer-free”, namely they do not use intrinsic inducers at all. Usually the decomposition procedure needs to choose the best decomposition structure among several structures that it considers. In order to measure the performance of a certain decomposition structure, there is a need to realize the structure by building a classifier for each component. However, since “inducer-free” decomposition does not use any induction algorithm, it uses a frequency table of the Cartesian product of the feature values instead. Consider the following example. The training set consists of four binary input attributes (a1, a2, a3, a4) and one target attribute (y). Assume that an “inducer-free” decomposition procedure examines the following feature set decomposition: (a1, a3) and (a2, a4). In order to measure the classification performance of this structure, it is required to build two classifiers, one for each subset. In the absence of an induction algorithm, two frequency tables are built; each table has 2^2 = 4 entries representing the Cartesian product of the attributes in each subset. For each entry in the table, we measure the frequency of the target attribute. Each one of the tables can be separately used to classify a new instance x: we search for the entry that corresponds to the instance x and select the target value with the highest frequency in that entry. This “inducer-free” strategy has been used in several places. For instance, the extension of Naïve Bayes suggested in [16] can be considered as a feature set decomposition with no intrinsic inducer. The function decomposition algorithm developed by using sparse frequency tables also fits this strategy [73].
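The worked example above can be reproduced in a few lines. In the sketch below (illustrative; attribute indices 0-3 stand for a1-a4, and the toy training set is ours) we build the two frequency tables for the subsets (a1, a3) and (a2, a4) and use one of them to classify a new instance:

```python
from collections import Counter, defaultdict

def build_table(data, subset):
    """Frequency table over the Cartesian product of the subset's values:
    maps each entry (a tuple of attribute values) to a Counter of target values."""
    table = defaultdict(Counter)
    for x, y in data:
        table[tuple(x[i] for i in subset)][y] += 1
    return table

def classify(table, subset, x):
    """Look up the entry corresponding to x; pick the most frequent target value."""
    entry = table[tuple(x[i] for i in subset)]
    return entry.most_common(1)[0][0]

# Toy training set: four binary inputs (a1..a4) and one target attribute y.
data = [((0, 0, 0, 0), 0), ((0, 1, 0, 1), 0), ((1, 0, 1, 0), 1), ((1, 1, 1, 1), 1)]
t13 = build_table(data, (0, 2))   # subset (a1, a3): up to 2^2 = 4 entries
t24 = build_table(data, (1, 3))   # subset (a2, a4)
print(classify(t13, (0, 2), (1, 0, 1, 0)))  # -> 1
```

Each table can be evaluated on a hold-out sample exactly as if it were a trained classifier, which is what lets the structure search proceed without any induction algorithm.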
Other implementations are considered an “inducer-dependent” type, namely these decomposition methods use intrinsic inducers and have been developed specifically for a certain inducer. They do not guarantee effectiveness with any other induction method. For instance, some works have been developed specifically for neural networks [41] or decision trees [58].

The third type of decomposition method is the “inducer-independent” type. These implementations can be performed with any given inducer; however, the same inducer is used in all subsets. As opposed to the “inducer-free” implementation, which does not use any inducer for its execution, “inducer-independent” requires the use of an inducer. Nevertheless, it is not limited to a specific inducer like the “inducer-dependent” type.
The last type is the “inducer-chooser” type, which, given a set of inducers, uses the most appropriate inducer for each sub-problem.

4.4 Exhaustiveness
This property indicates whether all data elements should be used in the decomposition. For instance, an exhaustive feature set decomposition refers to the situation in which each feature participates in at least one subset.
4.5 Combiner Usage
This property specifies the relation between the decomposer and the combiner. Some decomposers are combiner-dependent; that is to say, they have been developed specifically for a certain combination method, like voting or Naïve Bayes. Other decomposers are combiner-independent; the combination method is provided as input to the framework. Potentially there could be decomposers that, given a set of combiners, would be capable of choosing the best combiner for the current case.

4.6 Sequentially or Concurrently
This property indicates whether the various sub-classifiers are built sequentially or concurrently. In a sequential framework the outcome of a certain classifier may affect the creation of the next classifier. On the other hand, in a concurrent framework each classifier is built independently and their results are combined in some fashion. Some refer to this property as “the relationship between modules” [65] and distinguish between three different types: successive, cooperative and supervisory. Roughly speaking, “successive” refers to “sequential” while “cooperative” refers to “concurrent”. The last type applies to the case in which one model controls the other model; for instance, one neural network is used to tune another neural network.
The original problem in intermediate concept decomposition is usually converted into a sequential list of problems, where the last problem aims to solve the original one. On the other hand, in original concept decomposition the problem is usually divided into several sub-problems which exist on their own. Nevertheless, there are some exceptions. For instance, the “windowing” concept [52] is considered to be sequential.
Naturally there might be other important properties which can be used to differentiate a decomposition scheme. Table 1 summarizes the most relevant research performed on each decomposition type.
Table 1 Summary of Decomposition Methods in the Literature.

Paper  Decomposition Type  Mutually Exclusive  Structure Acquiring Method
[2]    Concept             No                  Arbitrarily
[11]   Concept             Yes                 Manually
[45]   Function            Yes                 Manually
[73]   Function            Yes                 Induced
[1]    Sample              No                  Arbitrarily
[17]   Sample              Yes                 Arbitrarily
[59]   Sample              Yes                 Induced
[54]   Space               No                  Induced
[35]   Space               Yes                 Induced
[4]    Attribute           No                  Arbitrarily
[38]   Attribute           Yes                 Manually
5 Meta-Decomposer Framework
As stated above, our ultimate goal is to develop a mechanism that combines all decomposition methods such that, given an unseen dataset, the most appropriate decomposition (if any) can be selected. There are two alternatives for achieving this automatic and systematic decomposition procedure:
– The wrapper approach – Given a certain dataset, use each elementary decomposition and select the one that appears to give the highest success rate. The main advantage of this approach is its ability to predict quite well the performance of each examined method. The main disadvantage of this method is its prolonged processing time. For some inducers the induction times may be very long, particularly on large real-life datasets. Several researchers have implemented this method for selecting induction algorithms or dimension reduction algorithms and showed that it produces superior results [,34].
– The meta-learning approach – Based on dataset characteristics, the meta-decomposer decides whether to decompose the problem and, if so, which elementary decomposition to use. The idea of the meta-decomposer approach can be summarized as follows: if a certain decomposition method outperforms other decomposition methods on a particular dataset, then one should expect that this method will be preferable when other problems with similar characteristics are presented. For this purpose one can employ meta-learning. Meta-learning is concerned with accumulating experience on the performance of multiple applications of a learning system. One possible output of the meta-learning process is a meta-classifier that is capable of indicating which learning method is most appropriate to a given problem. In this paper the meta-learning focuses on explaining what causes a decomposition method to be successful or not in a particular problem. Thus, in this case the meta-classifier is used as a meta-decomposer which attempts to select the most appropriate decomposition method. This goal can be accomplished by performing the following phases: In the first phase one should examine the performance of all investigated decomposition methods on various datasets. Upon examination of each dataset, the characteristics of the dataset are extracted. The dataset's characteristics, together with the indication of the most preferable decomposition method on this dataset, are stored in a meta-dataset. This meta-dataset reflects the experience accumulated across different datasets. In the second phase, an inducer can be applied to this meta-dataset to induce a meta-decomposer that can map a dataset to the most appropriate decomposition method (based on the characteristics of the dataset). In the last phase, the meta-decomposer is actually used to match a new, unseen dataset to the most appropriate decomposition method.
This paper adopts the second alternative and examines it on real-world problems. Previous works have already considered this approach for selecting the most appropriate induction algorithm given dataset characteristics (see for instance [22,69,5]). However, applying this methodology for selecting the most appropriate decomposition given a certain dataset has not yet been considered. The main disadvantage of the meta-learning process concerns the assumption that datasets with similar properties behave the same. Furthermore, in meta-learning the amount of data available (dataset descriptions and different performances) is usually quite small, thus the meta-decomposer is based on small meta-datasets. Nevertheless, the main advantage of this approach is that after the meta-learning phase is completed, it can be used to select the best decomposition in negligible processing time.
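The wrapper approach described above can be sketched in a few lines. The candidate names and the scoring callback below are illustrative stand-ins for real decomposition implementations and a cross-validated accuracy estimate; they are not part of the paper's framework.

```python
# Wrapper-style selection: estimate each candidate decomposition's success
# rate on the given dataset and keep the best performer. The candidate
# names and the evaluate callback are illustrative stand-ins.

def select_by_wrapper(candidates, evaluate, dataset):
    """candidates: dict mapping a name to a decomposition method;
    evaluate: callable(method, dataset) -> estimated accuracy."""
    scores = {name: evaluate(method, dataset) for name, method in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Toy usage with fabricated accuracy estimates standing in for the
# (potentially very slow) cross-validation runs:
fake = {"attribute": 0.86, "sample": 0.81, "space": 0.79, "none": 0.84}
best, scores = select_by_wrapper(fake, lambda m, _d: m, dataset=None)
print(best)  # -> attribute
```

In a real wrapper loop, `evaluate` would run the full induction and cross-validation pipeline per candidate, which is exactly the prolonged processing time the text warns about.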
Figures 2 and 3 present the schematic framework of the meta-decomposer. Figure 2 presents the meta-data generation phase. Figure 3 presents (A) the meta-learning phase and (B) the usage of the meta-decomposer. As can be seen in Figure 2, the Dataset Generator component is responsible for extending the original datasets repository into a much bigger repository by manipulating the original datasets.
Fig. 2 Meta-Data Generation Phase (components: Original Datasets Repository, Dataset Generator, Manipulated Datasets Repository, Dataset Characterizer, Decomposers Evaluator, Meta-Data).
5.1 Dataset Characterizer
It appears that datasets can be described as a vector of numeric values based on certain features. It is possible to categorize the characteristic measures into the following types [22]:
– Simple measures (e.g. number of attributes, number of classes, proportion of binary, categorical or ordinal attributes).
Fig. 3 (A) Meta-Induction Phase and (B) Meta-Decomposer Usage (components: Meta-Data, Inducer, New Datasets, Dataset Characterizer, Meta-Decomposer).
– Statistical measures (e.g. standard deviation ratio).
– Information based measures (e.g. mean entropy).

The meta-attributes used here are:
1. Number of instances in the training dataset.
2. Number of attributes.
3. Ratio of number of instances to the number of attributes - potential for overfitting. If this value is small, inducers may find a classifier that adapts too well to the specificities of the training data, which may be caused by noisy or irrelevant attributes, and thus result in poor generalization performance.
4. Number of classes - the domain size of the target attribute.
5. Proportion of the binary attributes.
6. Proportion of the categorical attributes.
7. Proportion of the ordinal attributes.
8. Default accuracy - accuracy obtained when using the most frequent class in the training set, without building any classifier.
9. Mean of means - the mean of all numerical attributes' means.
10. Mean of standard deviations - the mean of all numerical attributes' standard deviations.
11. Mean kurtosis - the mean of all numerical attributes' kurtosis measures. Kurtosis is a measure of whether the data are peaked or flat relative to a normal distribution.
12. Mean skewness - the mean of all numerical attributes' skewness measures. Skewness is a measure of symmetry, or more precisely, the lack of symmetry.
13. Mean of entropies - the mean of all attributes' simple entropy. Entropy measures the amount of uncertainty of a random variable. The Shannon entropy measure resembles the classic variance measurement in many ways: both are measures for quantifying uncertainty changes. However, entropy differs from variance in its metric-free nature: it depends only on the probability distribution of a random variable and not on its values. On the other hand, the entropy measure is not as expressive as the cumulative effect of all statistical measures. For this reason we also use two generalized entropy measures proposed in the literature: Rényi [55] and Tsallis [67].
14. Average absolute correlation between numeric attributes - indicates robustness to irrelevant attributes.
15. Proportion of numeric attributes with outliers - indicates robustness to outlying values. A certain attribute is considered to have outliers if the ratio of the variances of the mean value and the α-trimmed mean (where α = 0.05) is smaller than 0.7.
16. Average gain ratio - average information gain ratio of the target attribute obtained by splitting the data according to each attribute. Useful as an indicator of the amount of relevant information embodied in the input attributes.
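A few of the meta-attributes above, including the Shannon entropy and its Rényi and Tsallis generalizations, can be sketched as follows. The function names and the exact handling of attributes are illustrative assumptions; the paper does not prescribe an implementation.

```python
import math
from collections import Counter

# Sketch of a subset of the meta-attributes listed above. The entropies
# are computed from an attribute's empirical value distribution; q is the
# order parameter of the generalized (Renyi/Tsallis) measures.

def shannon_entropy(values):
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def renyi_entropy(values, q=2.0):
    n = len(values)
    return math.log2(sum((c / n) ** q for c in Counter(values).values())) / (1 - q)

def tsallis_entropy(values, q=2.0):
    n = len(values)
    return (1 - sum((c / n) ** q for c in Counter(values).values())) / (q - 1)

def simple_meta_features(instances, target):
    """instances: list of attribute-value lists; target: list of class labels."""
    n, m = len(instances), len(instances[0])
    counts = Counter(target)
    return {
        "n_instances": n,
        "n_attributes": m,
        "instance_attribute_ratio": n / m,
        "n_classes": len(counts),
        "default_accuracy": max(counts.values()) / n,
        "mean_entropy": sum(shannon_entropy([row[j] for row in instances])
                            for j in range(m)) / m,
    }

# Toy dataset: 4 instances, 2 attributes, binary target.
X = [[0, "a"], [0, "b"], [1, "a"], [1, "b"]]
y = ["+", "+", "+", "-"]
feats = simple_meta_features(X, y)
print(feats["instance_attribute_ratio"], feats["default_accuracy"])  # -> 2.0 0.75
```

Note the metric-free property discussed in item 13: all three entropy functions depend only on value counts, never on the values themselves.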
5.2 Dataset Manipulator
As stated above, one of the main drawbacks of the meta-learning methodology is the necessity to induce from a very limited meta-dataset. The purpose of the Dataset Manipulator component is to extend the original repository of datasets into a much bigger repository, thereby overcoming the limited meta-dataset problem.
Obviously the manipulation operators should effectively affect the dataset characteristics in order to explore new options that are not represented in the original repository. The following simple operators are suitable for this task:
– Projection – Randomly choose a subset of the original input attribute set and project the data according to it. This operation can have different levels of manipulation by setting a parameter that represents the subset size as a portion of the original attribute set. Note that this operation is disabled if the parameter is set to 100%.
– Selection – Randomly select a sub-sample of the original dataset. This operation can have different levels of manipulation by setting a parameter that represents the sub-sample size as a portion of the original dataset. Note that this operation is disabled if the parameter is set to 100%.
– Distortion – Change the distribution of the target attribute by reassigning some of the instances to a different class. This operation can have different levels of manipulation by setting a parameter that represents the portion of instances that remain untouched. Note that this operation is disabled if the parameter is set to 100%.

Each manipulated dataset is obtained by performing these three operations. Note that the order of the manipulative operations is immaterial.

5.3 Decomposer Evaluator
The aim of this component is to evaluate the performance of each decomposition method on each dataset in the manipulated repository. In this paper we use the C4.5 algorithm [52] as the internal inducer. Furthermore, we check the performance of C4.5 without performing any decomposition at all. Each manipulated dataset is represented in the meta-dataset as a single tuple, whose target value is the name of the method achieving the highest accuracy on this dataset. Performance is evaluated by averaging the results obtained by a 10-fold cross-validation procedure repeated five times.

6 Experimental Study
In this section we examine the suggested meta-decomposition approach. For this purpose we use 25 datasets from the UCI repository [44].
For each manipulation operation we have examined four different levels of manipulation: 100%, 90%, 75% and 50%. This results in 4³ = 64 different combinations (manipulated datasets) for every dataset in the original repository. Because the original dataset repository contains 25 datasets, the manipulated repository consists of 1600 datasets.
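The manipulation scheme of Section 5.2 at these four levels can be sketched as follows. The concrete sampling details (which rows, columns and labels are picked) are illustrative assumptions, not the paper's exact implementation; a level of 1.0 disables the corresponding operator.

```python
import itertools
import random

# Sketch of the projection, selection and distortion operators, each
# applied at one of four levels; the sampling details are illustrative.

LEVELS = [1.0, 0.9, 0.75, 0.5]

def project(X, level, rng):
    """Keep a random subset of attribute columns of relative size `level`."""
    m = len(X[0])
    keep = sorted(rng.sample(range(m), max(1, round(level * m))))
    return [[row[j] for j in keep] for row in X]

def select(X, y, level, rng):
    """Keep a random sub-sample of relative size `level`."""
    idx = sorted(rng.sample(range(len(X)), max(1, round(level * len(X)))))
    return [X[i] for i in idx], [y[i] for i in idx]

def distort(y, classes, level, rng):
    """Reassign a portion (1 - level) of the labels to a random other class."""
    y = list(y)
    for i in rng.sample(range(len(y)), round((1 - level) * len(y))):
        y[i] = rng.choice([c for c in classes if c != y[i]])
    return y

def manipulate(X, y, classes, levels, rng):
    X = project(X, levels[0], rng)
    X, y = select(X, y, levels[1], rng)
    return X, distort(y, classes, levels[2], rng)

combos = list(itertools.product(LEVELS, repeat=3))
print(len(combos), 25 * len(combos))  # -> 64 1600

rng = random.Random(0)
X = [[i, i % 2, i % 3, i % 5] for i in range(20)]
y = [i % 2 for i in range(20)]
Xm, ym = manipulate(X, y, {0, 1}, (0.5, 0.75, 0.9), rng)
```

The `combos` count reproduces the arithmetic above: 4 levels per operator, three operators, hence 64 variants per original dataset and 1600 in total.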
In this paper we compare only three decomposition methods: feature set decomposition using the DOG algorithm [58], space decomposition using the K-Classifier algorithm [60] and sample decomposition using the Cluster-Based Concurrent algorithm [59]. All algorithms were executed using their default parameters. Obviously there are several limitations in the methodology presented above. First, the results could be altered if the algorithms' parameters were tuned differently. Second, there is no guarantee that these algorithms precisely represent the corresponding decomposition method; namely, if different decomposition algorithms had been employed here, the results could be different.
Figure 4 presents the decision tree obtained by employing the C4.5 algorithm on the meta-data (the tree has been slightly modified for legibility). It can be seen that space decomposition is useful when there are numeric attributes. This makes sense, as the employed algorithm uses the K-Means algorithm, which is more suitable to a numeric instance space. The Instance-Attribute Ratio meta-attribute is used to differentiate between ATTRIBUTE (attribute decomposition) and NONE (not performing decomposition at all) in the second and third leaves. In this case, if the Instance-Attribute Ratio is below 78.77 (namely there are many attributes relative to the dataset size), then attribute decomposition should be applied. Another interesting observation is that both MeanEntropy2 and MeanTsallisEntropy2 (Tsallis' entropy measure) are found to be relevant. This indicates that the newly proposed measure is not redundant.
numprop <= 0
  MeanEntropy2 <= 0.03 → ATTRIBUTE (362.0/57.0)
  MeanEntropy2 > 0.03
    Catprop <= 0.67
      MeanTsallisEntropy2 <= 0.57
        Instance-Attribute Ratio <= 78.77 → ATTRIBUTE (336.0/135.0)
        Instance-Attribute Ratio > 78.77 → NONE (213.0/56.0)
      MeanTsallisEntropy2 > 0.57 → SAMPLE (425.0/235.0)
    Catprop > 0.67 → NONE (328.0)
numprop > 0 → SPACE (280.0/151)

Fig. 4 Meta-Decomposer Decision Tree.
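The induced tree of Figure 4 can be read as a simple decision procedure. The sketch below transcribes it directly, with attribute names and thresholds as they appear in the figure; the final split, labeled `numcount` in the extracted text, is treated here as the same numeric-attribute test as `numprop`, which is an assumption.

```python
# Direct transcription of the meta-decomposer tree of Figure 4.
# Input: a dict of meta-attribute values for a dataset.
# Output: the recommended decomposition method.

def meta_decomposer(f):
    if f["numprop"] > 0:                      # numeric attributes present
        return "SPACE"
    if f["MeanEntropy2"] <= 0.03:
        return "ATTRIBUTE"
    if f["Catprop"] > 0.67:
        return "NONE"
    if f["MeanTsallisEntropy2"] > 0.57:
        return "SAMPLE"
    if f["Instance-AttributeRatio"] <= 78.77:  # many attributes per instance
        return "ATTRIBUTE"
    return "NONE"

example = {"numprop": 0, "MeanEntropy2": 0.5, "Catprop": 0.3,
           "MeanTsallisEntropy2": 0.4, "Instance-AttributeRatio": 120}
print(meta_decomposer(example))  # -> NONE
```

Reading the tree this way also makes the text's interpretation concrete: a dataset with no numeric attributes, moderate entropy and a small Instance-Attribute Ratio falls into the ATTRIBUTE leaf.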
6.1 Evaluating the Meta-Decomposer
To evaluate whether the meta-decomposer brings some benefit, we have carried out the leave-one-out procedure. According to this procedure, the meta-decomposer has been generated by learning from a partial meta-dataset obtained by removing one original dataset (and all its derived datasets). Then the obtained meta-decomposer has been evaluated on the dataset that was left out. This procedure has been repeated 25 times, each time leaving out a different dataset. Table 2 presents the obtained results. The first column shows the dataset name. The second and third columns present, respectively, the actual best method and the method anticipated by the meta-decomposer to be the most suitable one. The fourth, fifth and sixth columns present the accuracy obtained using C4.5, the actual best method and the anticipated method, respectively. As can be seen, in almost two thirds of the datasets the meta-decomposer was able to predict which decomposition method (if any) outperforms the other methods. In the remaining cases the performance of the anticipated method was only slightly less accurate than that of the actual best method, with no statistical significance. Moreover, employing the meta-decomposer together with its corresponding decomposition method improves the accuracy of C4.5 by 6.86% ± 3.65% on average (while the best improvement that might be obtained by always selecting the most suitable decomposition method is 7.07% ± 3.%).
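The leave-one-dataset-out protocol described above can be sketched as follows. The grouping key, the `train`/`predict` callbacks and the toy meta-dataset are illustrative stand-ins; the paper's meta-inducer is C4.5, not the majority rule used here.

```python
from collections import Counter

# Leave-one-dataset-out evaluation of a meta-decomposer. Each meta-example
# records which original dataset it was derived from, so every variant of
# the held-out dataset is excluded from the training fold.

def leave_one_dataset_out(meta_examples, train, predict):
    origins = sorted({ex["origin"] for ex in meta_examples})
    hits = total = 0
    for held_out in origins:
        train_set = [ex for ex in meta_examples if ex["origin"] != held_out]
        test_set = [ex for ex in meta_examples if ex["origin"] == held_out]
        model = train(train_set)
        for ex in test_set:
            hits += predict(model, ex["features"]) == ex["best_method"]
            total += 1
    return hits / total

# Toy meta-dataset and a majority-class meta-inducer:
examples = [{"origin": o, "features": {}, "best_method": m}
            for o, m in [("iris", "NONE"), ("vote", "ATTRIBUTE"),
                         ("zoo", "ATTRIBUTE"), ("wine", "ATTRIBUTE")]]
train = lambda rows: Counter(r["best_method"] for r in rows).most_common(1)[0][0]
acc = leave_one_dataset_out(examples, train, lambda model, feats: model)
print(acc)  # -> 0.75
```

Excluding all derived variants of the held-out dataset is the essential detail: otherwise manipulated copies of the test dataset would leak into the meta-training fold.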
Table 2 Performance Results of Meta-Decomposer Procedure (columns: dataset name, actual best method, anticipated best method, and the accuracies of C4.5, the actual best method and the anticipated method; datasets: AUST, AUDIOLOGY, BCAN, HEPATITIS, IRIS, KR-VS-KP, LABOR, LED17, LETTER, LCAN, MONKS1, MONKS2, MONKS3, MUSH, NURSE, OPTIC, SONAR, SOYBEAN, SPI, TTT, VOTE, WINE, ZOO, ARCENE, DEXTER, MADELON).
7 The Relation to Other Methodologies
The main distinction between existing approaches, such as ensemble methods and distributed data mining, and decomposition methodology focuses on the following fact: the assumption that each model has access to data of comparable quality is not valid in the decomposition approach [68]. In decomposition methodology, classifiers may have significant variations in their overall performance. Furthermore, when individual classifiers have substantially different performances over different parts of the input space, combining is still desirable [68]. Nevertheless, neither simple combiners nor more sophisticated combiners are particularly well-suited for the type of problems that arise.
The ensemble methodology is closely related to the decomposition methodology. In both cases the final model is a composite of multiple models combined in some fashion. However, it is possible to distinguish between these methodologies in the following way [65]: the main idea of ensemble methodology is to combine a set of models, each of which solves the same original task. The purpose of ensemble methodology is to obtain a more accurate and reliable performance than when using a single model. On the other hand, the purpose of decomposition methodology is to break down a complex problem into several manageable problems, enabling each inducer to solve a different task. Therefore, in ensemble methodology, any model can provide a sufficient solution to the original task, whereas in decomposition methodology a combination of all models is mandatory for obtaining a reliable solution.
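This distinction can be made concrete with a toy sketch. In the ensemble half, each model solves the full task (here XOR) and voting merely improves reliability; in the decomposition half, each sub-model sees only one attribute of an AND task, so no sub-model alone solves the original problem and combining is mandatory. All "models" below are trivial hand-written stand-ins.

```python
# Ensemble: three full-task models for y = x1 XOR x2. Any one of them is
# already a usable solution; voting only adds reliability.
ensemble = [lambda x: x[0] ^ x[1],
            lambda x: (x[0] + x[1]) % 2,
            lambda x: int(x[0] != x[1])]

def vote(models, x):
    preds = [m(x) for m in models]
    return max(set(preds), key=preds.count)  # majority vote

# Decomposition: each sub-model handles only one attribute of the task
# y = x1 AND x2, so the combination step is mandatory.
submodels = [lambda x: x[0], lambda x: x[1]]

def combine(models, x):
    return min(m(x) for m in models)  # AND of the partial answers

x = (1, 0)
print(vote(ensemble, x), combine(submodels, x))  # -> 1 0
```

Dropping one ensemble member still leaves a correct XOR classifier, but dropping either sub-model makes the AND task unsolvable, which is exactly the contrast the paragraph draws.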
Distributed data mining (DDM) deals with mining data that might be inherently distributed among different, loosely coupled sites with slow connectivity, such as geographically distributed sites connected over the Internet [33]. Usually DDM is categorized according to data distribution:

– Homogeneous – In this case, the datasets in all the sites are built from the same common set of attributes. This state is equivalent to the sample decomposition discussed above, where the decomposition structure is set by the environment.
– Heterogeneous – In this case, the quality and quantity of data available to each site may vary substantially. Since each specific site may contain data for different attributes, leading to large discrepancies in their performance, integrating classification models derived from distinct and distributed databases is complex. DDM can also be useful in the case of "mergers and acquisitions" of corporations. In such cases, since each company involved may have its own IT legacy systems, different sets of data are available.
In DDM the different sources are given, namely the instances are pre-decomposed. As a result, DDM is mainly focused on combining the various methods. Several researchers discuss ways of leveraging distributed techniques in knowledge discovery, such as data cleaning and preprocessing, transformation, and learning.
The JAM system is a meta-learning approach [50]. The meta-learning approach is about combining several models (describing several sets of data from several sources of data) into one high-level model.
Another meta-learning concept is knowledge probing [23]. In knowledge probing, supervised learning is organized into two stages. In the first stage, a set of base classifiers is constructed using the distributed datasets. In the second stage, the relationship between an attribute vector and the class predictions from all of the base classifiers is determined.
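The two stages of knowledge probing can be sketched as follows. The base "classifiers" are trivial one-dimensional threshold rules and the stage-two mapping is a plain majority vote; both are illustrative stand-ins, since [23] learns the second-stage relationship rather than fixing it.

```python
# Two-stage knowledge probing, sketched with trivial components: stage one
# builds a base classifier per distributed partition; stage two derives a
# single final predictor from the base classifiers' outputs (a majority
# vote stands in here for the learned mapping).

def stage_one(partitions, fit):
    return [fit(part) for part in partitions]

def stage_two(bases):
    def final(x):
        votes = [b(x) for b in bases]
        return max(set(votes), key=votes.count)
    return final

# Toy 1-D partitions of a problem where the label is 1 iff x is large.
# Each partition yields a threshold rule: the smallest x seen with y == 1.
fit = lambda part: (lambda x, t=min(x for x, y in part if y == 1): int(x >= t))
parts = [[(1, 0), (6, 1)], [(2, 0), (5, 1)], [(3, 0), (9, 1)]]
final = stage_two(stage_one(parts, fit))
print(final(7), final(2))  # -> 1 0
```

Each partition produces a different threshold (6, 5 and 9 here), mirroring the heterogeneous-site setting where base classifiers disagree and the second stage arbitrates.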
A closely related field is parallel data mining (PDM). PDM deals with mining data by using several tightly coupled systems with fast interconnection, as in the case of a cluster of shared-memory workstations [71].
The main goal of PDM techniques is to scale up the speed of data mining on large datasets. It addresses this issue by using high-performance, multi-processor computers. The increasing availability of such computers calls for extensive development of data analysis algorithms that can scale up as we attempt to analyze datasets measured in terabytes on parallel machines with thousands of processors. This technology is particularly suitable for applications that typically deal with large amounts of data, e.g. company transaction data, scientific simulation and observation data. Another important example of PDM is the SPIDER project, which uses shared-memory multiprocessor systems (SMPs) to accomplish PDM on distributed datasets [72].
8 Conclusion and Future Work
In this paper we have reviewed the necessity of decomposition methodology in pattern recognition, machine learning and data mining. We have suggested an approach to categorize elementary decomposition methods. We also discussed the main characteristics of decomposition methods and demonstrated these characteristics on various methods presented in the literature.
Finally, we proposed a meta-decomposition approach and validated its prediction capabilities. We have shown empirically that the proposed meta-decomposer usually selects the most appropriate decomposition method. In fact, using the meta-decomposer achieved an accuracy improvement of 6.8%, while using the a posteriori best method provided only slightly better results (7.1%).
Additional research issues in the meta-decomposer approach include: extending the meta-learning scheme to investigate other decomposition methods presented in Section 3, more precisely function decomposition and concept aggregation; checking whether the meta-learning results remain valid when different decomposition algorithm implementations are used, namely examining the robustness of the meta-decomposer; examining the effectiveness of recursively applying the meta-decomposer; and, finally, how prior knowledge can be utilized for decomposition methodology.
9 Originality and Contribution
This paper introduces a novel taxonomy for decomposition methods as applied to classification tasks. The proposed taxonomy refers to elementary decomposition methods that can be used as building blocks to construct a more complicated decomposition. The taxonomy is illustrated using an extensive review of existing decomposition methods.
The taxonomy is subsequently used as the basis for a new meta-decomposition methodology which is designed to automatically select the best decomposition method for a given database. For this purpose we examine a meta-decomposer framework that contains two phases. In the first phase the meta-dataset is generated by evaluating the performance of several decomposition methods on a given set of datasets. Every dataset is characterized by a set of well-known measures, such as entropy and skewness. In order to significantly extend the dataset variety, we suggest using a randomized dataset manipulator. In the second phase a decision-tree-based meta-decomposer is trained using the generated meta-dataset. Then, when a new dataset is provided, we employ the meta-decomposer to select the most promising decomposition method.

10 Acknowledgment
The author would like to thank Prof. Oded Maimon for his helpful and inspiring comments.

References
1. Ali K. M., Pazzani M. J., Error Reduction through Learning Multiple Descriptions, Machine Learning, 24(3):173-202, 1996.
2. Anand R., Methrotra K., Mohan C. K., Ranka S., Efficient classification for multiclass problems using modular neural networks, IEEE Trans. Neural Networks, 6(1):117-125, 1995.
3. Baxt W. G., Use of an artificial neural network for data analysis in clinical decision making: The diagnosis of acute coronary occlusion, Neural Computation, 2(4):480-4, 1990.
4. Bay S., Nearest neighbor classification from multiple feature subsets, Intelligent Data Analysis, 3(3):191-209, 1999.
5. Bensusan H. and Kalousis A., Estimating the Predictive Accuracy of a Classifier, in Proceedings of the 12th European Conference on Machine Learning, pp. 25-36, 2001.
6. Berry M. and Linoff G., Mastering Data Mining, John Wiley & Sons, 2000.
7. Bhargava H. K., Data Mining by Decomposition: Adaptive Search for Hypothesis Generation, INFORMS Journal on Computing, 11(3):239-47, 1999.
8. Biermann A. W., Faireld J., and Beres T., Signature table systems and learning, IEEE Trans. Syst. Man Cybern., 12(5):635-8, 1982.
9. Blum A. and Mitchell T., Combining Labeled and Unlabeled Data with Co-Training, in Proc. of the 11th Annual Conference on Computational Learning Theory, pp. 92-100, 1998.
10. Breiman L., Bagging predictors, Machine Learning, 24(2):123-140, 1996.
11. Buntine W., "Graphical Models for Discovering Knowledge", in U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, editors, Advances in Knowledge Discovery and Data Mining, pp. 59-82, AAAI/MIT Press, 1996.
12. Chan P. K. and Stolfo S. J., On the Accuracy of Meta-learning for Scalable Data Mining, J. Intelligent Information Systems, 8:5-28, 1997.
13. Chen K., Wang L. and Chi H., Methods of Combining Multiple Classifiers with Different Features and Their Applications to Text-Independent Speaker Identification, International Journal of Pattern Recognition and Artificial Intelligence, 11(3):417-445, 1997.
14. Cherkauer K. J., Human Expert-Level Performance on a Scientific Image Analysis Task by a System Using Combined Artificial Neural Networks, in Working Notes, Integrating Multiple Learned Models for Improving and Scaling Machine Learning Algorithms Workshop, Thirteenth National Conference on Artificial Intelligence, Portland, OR, AAAI Press, 1996.
15. Dietterich T. G. and Bakiri G., Solving multiclass learning problems via error-correcting output codes, Journal of Artificial Intelligence Research, 2:263-286, 1995.
16. Domingos P. and Pazzani M., On the Optimality of the Naive Bayes Classifier under Zero-One Loss, Machine Learning, 29(2):103-130, 1997.
17. Domingos P., Using Partitioning to Speed Up Specific-to-General Rule Induction, in Proceedings of the AAAI-96 Workshop on Integrating Multiple Learned Models, pp. 29-34, AAAI Press, 1996.
18. Fischer B., "Decomposition of Time Series - Comparing Different Methods in Theory and Practice", Eurostat Working Paper, 1995.
19. Friedman J. H., "Multivariate Adaptive Regression Splines", The Annals of Statistics, 19:1-141, 1991.
20. Fürnkranz J., More efficient windowing, in Proceedings of the 14th National Conference on Artificial Intelligence (AAAI-97), pp. 509-514, Providence, RI, AAAI Press, 1997.
21. Gama J., A Linear-Bayes Classifier, in C. Monard, editor, Advances on Artificial Intelligence - SBIA 2000, LNAI 1952, pp. 269-279, Springer Verlag, 2000.
22. Giraud-Carrier C., Vilalta R., Brazdil P., Introduction to the Special Issue on Meta-Learning, Machine Learning, 54(3):187-193, 2004.
23. Guo Y. and Sutiwaraphun J., Knowledge probing in distributed data mining, in Proc. 4th Int. Conf. Knowledge Discovery Data Mining, pp. 61-69, 1998.
24. Hampshire J. B. and Waibel A., The meta-Pi network: building distributed knowledge representations for robust multisource pattern recognition, Pattern Analyses and Machine Intelligence, 14(7):751-769, 1992.
25. Hansen J., Combining Predictors: Meta Machine Learning Methods and Bias, Variance & Ambiguity Decompositions, PhD dissertation, Aarhus University, 2000.
26. He D. W., Strege B., Tolle H., and Kusiak A., Decomposition in Automatic Generation of Petri Nets for Manufacturing System Control and Scheduling, International Journal of Production Research, 38(6):1437-1457, 2000.
27. Holmstrom L., Koistinen P., Laaksonen J., and Oja E., Neural and statistical classifiers - taxonomy and a case study, IEEE Trans. on Neural Networks, 8:5-17, 1997.
28. Hrycej T., Modular Learning in Neural Networks, New York: Wiley, 1992.
29. Hu X., Using Rough Sets Theory and Database Operations to Construct a Good Ensemble of Classifiers for Data Mining Applications, ICDM01, pp. 233-240, 2001.
30. Jenkins R. and Yuhas B. P., A simplified neural network solution through problem decomposition: The case of the truck backer-upper, IEEE Transactions on Neural Networks, 4(4):718-722, 1993.
31. Johansen T. A. and Foss B. A., An ARMAX model representation for adaptive control based on local models, Modeling, Identification and Control, 13(1):25-39, 1992.
32. Jordan M. I. and Jacobs R. A., Hierarchical mixtures of experts and the EM algorithm, Neural Computation, 6:181-214, 1994.
33. Kargupta H. and Chan P., eds., Advances in Distributed and Parallel Knowledge Discovery, pp. 185-210, AAAI/MIT Press, 2000.
34. Kohavi R. and John G., The Wrapper Approach, in Feature Extraction, Construction and Selection: A Data Mining Perspective, H. Liu and H. Motoda (eds.), Kluwer Academic Publishers, 1998.
35. Kohavi R., Becker B., and Sommerfield D., Improving simple Bayes, in Proceedings of the European Conference on Machine Learning, 1997.
36. Kononenko I., Comparison of inductive and Naive Bayes learning approaches to automatic knowledge acquisition, in B. Wielinga (Ed.), Current Trends in Knowledge Acquisition, Amsterdam, The Netherlands, IOS Press, 1990.
37. Kusiak A., Szczerbicki E., and Park K., A Novel Approach to Decomposition of Design Specifications and Search for Solutions, International Journal of Production Research, 29(7):1391-1406, 1991.
38. Kusiak A., "Decomposition in Data Mining: An Industrial Case Study", IEEE Transactions on Electronics Packaging Manufacturing, 23(4):345-353, 2000.
39. Liao Y. and Moody J., Constructing Heterogeneous Committees via Input Feature Grouping, in Advances in Neural Information Processing Systems, Vol. 12, S. A. Solla, T. K. Leen and K.-R. Muller (eds.), MIT Press, 2000.
40. Long C., Bi-Decomposition of Function Sets Using Multi-Valued Logic, Eng. Doc. Dissertation, Technischen Universitat Bergakademie Freiberg, 2003.
41. Lu B. L. and Ito M., Task Decomposition and Module Combination Based on Class Relations: A Modular Neural Network for Pattern Classification, IEEE Trans. on Neural Networks, 10(5):1244-1256, 1999.
42. Maimon O. and Rokach L., "Data Mining by Attribute Decomposition with semiconductors manufacturing case study", in Data Mining for Design and Manufacturing: Methods and Applications, Kluwer Academic Publishers, pp. 311-336, 2001.
43. Maimon O. and Rokach L., Decomposition Methodology for Knowledge Discovery and Data Mining: Theory and Applications, World Scientific, 2005.
44. Merz C. J. and Murphy P. M., UCI Repository of machine learning databases, Irvine, CA: University of California, Department of Information and Computer Science, 1998.
45. Michie D., Problem decomposition and the learning of skills, in Proceedings of the European Conference on Machine Learning, pp. 17-31, Springer-Verlag, 1995.
46. Nowlan S. J. and Hinton G. E., Evaluation of adaptive mixtures of competing experts, in Advances in Neural Information Processing Systems, R. P. Lippmann, J. E. Moody, and D. S. Touretzky, Eds., vol. 3, pp. 774-780, Morgan Kaufmann Publishers Inc., 1991.
47. Ohno-Machado L. and Musen M. A., Modular neural networks for medical prognosis: Quantifying the benefits of combining neural networks for survival prediction, Connection Science, 9(1):71-86, 1997.
48. Peng F., Jacobs R. A., and Tanner M. A., Bayesian Inference in Mixtures-of-Experts and Hierarchical Mixtures-of-Experts Models With an Application to Speech Recognition, Journal of the American Statistical Association, 1995.
49. Pratt L. Y., Mostow J., and Kamm C. A., Direct Transfer of Learned Information Among Neural Networks, in Proceedings of the Ninth National Conference on Artificial Intelligence, Anaheim, CA, pp. 584-5, 1991.
50. Prodromidis A. L., Stolfo S. J. and Chan P. K., Effective and efficient pruning of meta classifiers in a distributed data mining system, Technical report CUCS-017-99, Columbia Univ., 1999.
51. Provost F. J. and Kolluri V., A Survey of Methods for Scaling Up Inductive Learning Algorithms, Proc. 3rd International Conference on Knowledge Discovery and Data Mining, 1997.
52. Quinlan J. R., C4.5: Programs for Machine Learning, Morgan Kaufmann, Los Altos, 1993.
53. Rahman A. F. R. and Fairhurst M. C., A new hybrid approach in combining multiple experts to recognize handwritten numerals, Pattern Recognition Letters, 18:781-790, 1997.
54. Ramamurti V. and Ghosh J., Structurally Adaptive Modular Networks for Non-Stationary Environments, IEEE Transactions on Neural Networks, 10(1):152-160, 1999.
55. Rényi A., Probability Theory, North-Holland, Amsterdam, 1970.
56. Ridgeway G., Madigan D., Richardson T. and O'Kane J., Interpretable Boosted Naive Bayes Classification, Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining, pp. 101-104, 1998.
57. Rokach L. and Maimon O., "Decomposition Methodology for Classification Tasks", Proceedings of the IEEE International Conference on Granular Computing, Beijing, July 2005, IEEE Computer Society Press, ISBN 0-7803-9017-2, pp. 636-1.
58. Rokach L. and Maimon O., "Feature set decomposition for decision trees", Intelligent Data Analysis, 6(2):1-28, 2005.
59. Rokach L., Maimon O., Arad O., "Improving Supervised Learning by Sample Decomposition", International Journal of Computational Intelligence and Applications, 5(1):37-54, 2005.
60. Rokach L., Maimon O. and Lavi I., "Space Decomposition in Data Mining: A Clustering Approach", Proceedings of the 14th International Symposium On Methodologies For Intelligent Systems, Maebashi, Japan, Lecture Notes in Computer Science, Springer-Verlag, pp. 24-31, 2003.
61. Ronco E., Gollee H., and Gawthrop P. J., Modular neural network and self-decomposition, CSC Research Report CSC-96012, Centre for Systems and Control, University of Glasgow, 1996.
62. Saaty T., The analytic hierarchy process: A 1993 overview, Central European Journal for Operations Research and Economics, 2(2):119-137, 1993.
63. Samuel A., Some studies in machine learning using the game of checkers II: Recent progress, IBM J. Res. Develop., 11:601-617, 1967.
64. Schaffer C., Selecting a classification method by cross-validation, Machine Learning, 13(1):135-143, 1993.
65. Sharkey A., On combining artificial neural nets, Connection Science, 8:299-313, 1996.
66. Sharkey A., Multi-Net Systems, in Sharkey A. (Ed.), Combining Artificial Neural Networks: Ensemble and Modular Multi-Net Systems, pp. 1-30, Springer-Verlag, 1999.
67. Tsallis C., Possible Generalization of Boltzmann-Gibbs Statistics, J. Stat. Phys., 52:479-487, 1988.
68. Tumer K. and Ghosh J., Linear and Order Statistics Combiners for Pattern Classification, in Combining Artificial Neural Nets, A. Sharkey (Ed.), pp. 127-162, Springer-Verlag, 1999.
69. Vilalta R., Giraud-Carrier C., Brazdil P., "Meta-Learning", in O. Maimon and L. Rokach (Eds.), Handbook of Data Mining and Knowledge Discovery in Databases, pp. 731-748, Springer, 2005.
70. Weigend A. S., Mangeas M., and Srivastava A. N., Nonlinear gated experts for time series: discovering regimes and avoiding overfitting, International Journal of Neural Systems, 6(5):373-399, 1995.
71. Zaki M. J. and Ho C. T., Eds., Large-Scale Parallel Data Mining, New York: Springer-Verlag, 2000.
72. Zaki M. J., Ho C. T., and Agrawal R., Scalable parallel classification for data mining on shared-memory multiprocessors, in Proc. IEEE Int. Conf. Data Eng., Sydney, Australia, WKDD99, pp. 198-205, 1999.
73. Zupan B., Bohanec M., Demsar J., and Bratko I., Feature transformation by function decomposition, IEEE Intelligent Systems & Their Applications, 13:38-43, 1998.