Deep learning
Introduction
Deep learning is a general term for a class of pattern analysis methods. In terms of specific research content, it mainly involves three types of methods:
(1) Neural network systems based on the convolution operation, namely the Convolutional Neural Network (CNN).
(2) Self-encoding neural networks built from multiple layers of neurons, of which there are two types, the Autoencoder and Sparse Coding; both have received widespread attention in recent years.
(3) The Deep Belief Network (DBN), which uses a multi-layer self-encoding neural network for pre-training and then uses label information to further optimize the network weights.
Through multi-layer processing, the initial "low-level" feature representation is gradually transformed into a "high-level" feature representation, after which a "simple model" can be used to complete complex classification and other learning tasks. Deep learning can therefore be understood as "feature learning" or "representation learning".
In the past, when machine learning was applied to real-world tasks, the features used to describe samples were usually designed by human experts, a practice called "feature engineering". The quality of features has a crucial impact on generalization performance, and it is not easy for human experts to design good features. Feature learning (representation learning) generates good features through machine learning itself, which brings machine learning one step closer to "fully automated data analysis".
In recent years, researchers have gradually combined these types of methods. For example, convolutional neural networks, which were originally based on supervised learning, have been combined with auto-encoding neural networks for unsupervised pre-training and then fine-tuned using label information, forming the convolutional deep belief network. Compared with traditional learning methods, deep learning methods have more model parameters, so model training is more difficult. According to the general laws of statistical learning, the more parameters a model has, the more training data it needs.
In the 1980s and 1990s, because of limited computing power and the limitations of related technologies, the amount of data available for analysis was too small, and deep learning did not show excellent recognition performance in pattern analysis. Since 2006, when Hinton et al. used the CD-k (contrastive divergence) algorithm to quickly compute the weights and biases of restricted Boltzmann machine (RBM) networks, the RBM has become a powerful tool for increasing the depth of neural networks. This led to the widespread use of the DBN (developed by Hinton and others and used in speech recognition by companies such as Microsoft) and to the appearance of other deep networks. At the same time, sparse coding is also used in deep learning because it can automatically extract features from data. Convolutional neural network methods based on local data regions have also been studied extensively in recent years.
Interpretation
Deep learning is a type of machine learning, and machine learning is a necessary path toward realizing artificial intelligence. The concept of deep learning originates from research on artificial neural networks; a multi-layer perceptron with multiple hidden layers is one deep learning structure. Deep learning combines low-level features to form more abstract high-level representations (attribute categories or features), in order to discover distributed feature representations of data. The motivation for studying deep learning is to build neural networks that simulate the human brain for analysis and learning, mimicking the mechanisms the brain uses to interpret data such as images, sounds, and text.
The computation involved in generating an output from an input can be represented by a flow graph: a graph in which each node represents a basic computation and the value it computes, and the result of the computation is passed on to the node's children. Consider the set of computations allowed at each node together with the possible graph structures; this defines a family of functions. Input nodes have no parents, and output nodes have no children.
A particular property of such a flow graph is its depth: the length of the longest path from an input to an output.
Traditional feedforward neural networks can be viewed as having a depth equal to the number of layers (that is, the number of hidden layers plus 1 for the output layer). SVMs have a depth of 2 (one level for the kernel outputs or feature space, and one for the linear combination that produces the output).
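The definition of depth as the longest input-to-output path can be made concrete with a short sketch. The two graphs below are hypothetical illustrations (the node names are not from the text): one models a feedforward network with two hidden layers, the other an SVM.

```python
from functools import lru_cache

# Hypothetical flow graphs: each node maps to the nodes that consume its value.
# A feedforward network with two hidden layers: x -> h1 -> h2 -> y.
mlp_graph = {"x": ["h1"], "h1": ["h2"], "h2": ["y"], "y": []}
# An SVM: input -> kernel outputs -> linear combination.
svm_graph = {"x": ["kernel"], "kernel": ["linear"], "linear": []}


def depth(graph, inputs):
    """Length (in edges) of the longest path from any input node to an output."""

    @lru_cache(maxsize=None)
    def longest_from(node):
        children = graph[node]
        if not children:          # output node: no children
            return 0
        return 1 + max(longest_from(child) for child in children)

    return max(longest_from(node) for node in inputs)


mlp_depth = depth(mlp_graph, ["x"])  # two hidden layers + 1 output layer
svm_depth = depth(svm_graph, ["x"])
print(mlp_depth, svm_depth)
```

Counting edges rather than nodes is what makes the result agree with "number of hidden layers plus 1" for the feedforward case.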
One direction of artificial intelligence research is represented by the so-called "expert system", which is defined by a large number of "If-Then" rules and reflects top-down thinking. The Artificial Neural Network represents another, bottom-up way of thinking. There is no strict formal definition of a neural network; its basic feature is that it tries to imitate the way information is transmitted and processed between neurons in the brain.
Features
Deep learning differs from traditional shallow learning in two ways:
(1) It emphasizes the depth of the model structure, which usually has 5, 6, or even 10 layers of hidden nodes;
(2) It makes the importance of feature learning explicit. That is, through layer-by-layer feature transformation, the representation of a sample in the original space is mapped to a new feature space, which makes classification or prediction easier. Compared with constructing features by hand-crafted rules, learning features from big data is better able to capture the rich internal information of the data.
By designing an appropriate number of neuron computing nodes and a multi-layer computing hierarchy, selecting suitable input and output layers, and learning and tuning the network, a functional relationship from input to output is established. Although the true relationship between input and output cannot be recovered exactly, it can be approximated as closely as possible. With a successfully trained network model, complex processing tasks can be automated.
Typical deep learning models
Typical deep learning models include the convolutional neural network, the DBN, and the stacked auto-encoder network. These models are described below.
Convolutional neural network model
Before the emergence of unsupervised pre-training, training deep neural networks was usually very difficult; convolutional neural networks were one exception. Convolutional neural networks are inspired by the structure of the visual system. The first computational model of this kind was Fukushima's neocognitron. Based on local connections between neurons and a hierarchically organized transformation of the image, it applies neurons with the same parameters at different positions of the previous layer, which yields a translation-invariant network structure. Later, building on this idea, LeCun et al. used error gradients to design and train convolutional neural networks and obtained superior performance on some pattern recognition tasks. To date, pattern recognition systems based on convolutional neural networks are among the best-performing systems, especially for handwritten character recognition.
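The effect of applying the same parameters at every position can be illustrated with a minimal one-dimensional sketch. The kernel and signal values are assumptions chosen for illustration: shifting the input shifts the feature map by the same amount, and taking the maximum over positions then gives a response that does not depend on where the pattern occurs.

```python
import numpy as np


def conv1d_valid(signal, kernel):
    """Slide one shared kernel over the signal (cross-correlation, no padding)."""
    k = len(kernel)
    return np.array([np.dot(signal[i:i + k], kernel)
                     for i in range(len(signal) - k + 1)])


kernel = np.array([1.0, -1.0])                      # hypothetical edge detector
signal = np.array([0, 0, 1, 1, 0, 0, 0, 0], dtype=float)
shifted = np.roll(signal, 2)                        # same pattern, two steps right

out = conv1d_valid(signal, kernel)
out_shifted = conv1d_valid(shifted, kernel)

# The feature map moves with the input (shared weights at every position) ...
print(out)
print(out_shifted)
# ... and a global max over positions is unaffected by the shift.
print(out.max(), out_shifted.max())
```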
Deep belief network model
A DBN can be interpreted as a Bayesian probabilistic generative model composed of multiple layers of stochastic latent variables. The top two layers have undirected, symmetric connections, and the lower layers receive top-down directed connections from the layer above. The states of the bottom units are the visible input data vector. A DBN is composed of a stack of several structural units, and the structural unit is usually an RBM (Restricted Boltzmann Machine). The number of visible-layer neurons in each RBM unit in the stack equals the number of hidden-layer neurons in the previous RBM unit. Following the deep learning mechanism, the input examples are used to train the first RBM unit, its output is used to train the second RBM, and further RBMs are stacked in the same way to improve model performance. In unsupervised pre-training, the input is encoded upward through the DBN to the top RBM, and the state of the top layer is then decoded back down to the bottom units to reconstruct the input. As the structural unit of the DBN, each RBM shares its parameters with the corresponding layer of the DBN.
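A single RBM unit of the kind stacked in a DBN can be trained with contrastive divergence. The following is a minimal sketch of one common CD-1 variant (the reconstruction uses visible probabilities rather than samples). The toy data, layer sizes, learning rate, and function names are assumptions made for illustration, not values from the text.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def cd1_step(v0, W, b_vis, b_hid, rng, lr=0.1):
    """One CD-1 update on a batch v0 of binary visible vectors (one per row)."""
    # Positive phase: hidden probabilities and a binary sample given the data.
    ph0 = sigmoid(v0 @ W + b_hid)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one Gibbs step down to the visible layer and back up.
    pv1 = sigmoid(h0 @ W.T + b_vis)
    ph1 = sigmoid(pv1 @ W + b_hid)
    n = v0.shape[0]
    # Update: data-driven statistics minus reconstruction-driven statistics.
    W = W + lr * (v0.T @ ph0 - pv1.T @ ph1) / n
    b_vis = b_vis + lr * (v0 - pv1).mean(axis=0)
    b_hid = b_hid + lr * (ph0 - ph1).mean(axis=0)
    return W, b_vis, b_hid, pv1


rng = np.random.default_rng(0)
data = np.array([[1, 1, 0, 0], [0, 0, 1, 1]] * 10, dtype=float)  # toy patterns
W = 0.1 * rng.standard_normal((4, 3))     # 4 visible units, 3 hidden units
b_vis, b_hid = np.zeros(4), np.zeros(3)

for _ in range(1000):
    W, b_vis, b_hid, recon = cd1_step(data, W, b_vis, b_hid, rng)

recon_error = np.mean((data - recon) ** 2)
hidden_features = sigmoid(data @ W + b_hid)   # would feed the next RBM in a stack
print(recon_error)
```

In a stack, `hidden_features` would play the role of the visible data for the next RBM, which is why the visible size of each unit must match the hidden size of the one below it.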
Stacked self-encoding network model
The structure of the stacked self-encoding network is similar to that of the DBN: it consists of a stack of several structural units. The difference is that the structural unit is a self-encoding model (auto-encoder) instead of an RBM. The self-encoding model is a two-layer neural network; the first layer is called the encoding layer, and the second is called the decoding layer.
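The two-layer structure of a single auto-encoder unit can be sketched as follows. The tanh encoder, linear decoder, squared-error loss, toy data, and hyperparameters are all assumptions chosen to keep the example small; other choices are equally valid.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[1, 1, 0, 0], [0, 0, 1, 1]] * 10, dtype=float)  # toy inputs
n, d = X.shape
n_hidden = 2
lr = 0.05

# Encoding layer (d -> n_hidden) and decoding layer (n_hidden -> d).
W_enc = 0.1 * rng.standard_normal((d, n_hidden))
b_enc = np.zeros(n_hidden)
W_dec = 0.1 * rng.standard_normal((n_hidden, d))
b_dec = np.zeros(d)


def encode(X):
    return np.tanh(X @ W_enc + b_enc)


def decode(H):
    return H @ W_dec + b_dec


def loss(X):
    """Half the mean squared reconstruction error per sample."""
    return 0.5 * np.sum((decode(encode(X)) - X) ** 2) / len(X)


initial_loss = loss(X)
for _ in range(2000):
    H = encode(X)
    d_out = (decode(H) - X) / n                   # gradient at the decoder output
    d_hid = (d_out @ W_dec.T) * (1.0 - H ** 2)    # back through tanh
    W_dec -= lr * H.T @ d_out
    b_dec -= lr * d_out.sum(axis=0)
    W_enc -= lr * X.T @ d_hid
    b_enc -= lr * d_hid.sum(axis=0)
final_loss = loss(X)
print(initial_loss, final_loss)
```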
Deep learning training process
In 2006, Hinton proposed an effective method for building a multi-layer neural network on unlabeled data. It has two steps: first, the network is built one layer at a time, so that only a single-layer network is trained at each step; then, after all layers have been trained, the wake-sleep algorithm is used for tuning.
The weights between all layers except the top layer are made bidirectional, so that the top layer remains a single-layer neural network while the other layers become a graphical model. The upward weights are used for "cognition" and the downward weights for "generation". The wake-sleep algorithm is then used to adjust all the weights so that cognition and generation agree, that is, so that the top-level representation produced by cognition can restore the bottom-level nodes as accurately as possible. For example, if a node in the top layer represents a human face, then images of all faces should activate this node, and the image generated downward from it should look like a rough face. The wake-sleep algorithm has two parts, wake and sleep.
Wake stage: the cognitive process. Features of the outside world and the upward weights produce an abstract representation at each layer, and gradient descent is used to modify the downward weights between layers.
Sleep stage: the generative process. The top-level representation and the downward weights generate the states of the lower layers, and the upward weights between layers are modified.
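The two stages can be sketched for the simplest possible case, a model with one visible and one hidden layer of binary stochastic units. This is a simplified illustration and not the full multi-layer procedure described above. The names `R` (upward, recognition weights), `G` (downward, generative weights), `b_top`, the toy data, and the learning rate are all assumptions.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def sample(p, rng):
    """Draw binary states with the given activation probabilities."""
    return (rng.random(p.shape) < p).astype(float)


def wake_sleep_step(v, params, rng, lr=0.05):
    R, b_rec, G, b_gen, b_top = params
    n = len(v)
    # Wake: recognize the data with the upward weights, then adjust the
    # downward weights so they reconstruct the data from the hidden states.
    h = sample(sigmoid(v @ R + b_rec), rng)
    v_pred = sigmoid(h @ G + b_gen)
    G = G + lr * (h.T @ (v - v_pred)) / n
    b_gen = b_gen + lr * (v - v_pred).mean(axis=0)
    b_top = b_top + lr * (h - sigmoid(b_top)).mean(axis=0)
    # Sleep: generate a "dream" top-down, then adjust the upward weights so
    # they recover the hidden states that produced the dream.
    h_dream = sample(np.tile(sigmoid(b_top), (n, 1)), rng)
    v_dream = sample(sigmoid(h_dream @ G + b_gen), rng)
    h_pred = sigmoid(v_dream @ R + b_rec)
    R = R + lr * (v_dream.T @ (h_dream - h_pred)) / n
    b_rec = b_rec + lr * (h_dream - h_pred).mean(axis=0)
    return R, b_rec, G, b_gen, b_top


rng = np.random.default_rng(0)
data = np.array([[1, 1, 0, 0], [0, 0, 1, 1]] * 10, dtype=float)  # toy patterns
params = (0.1 * rng.standard_normal((4, 3)), np.zeros(3),   # R, b_rec
          0.1 * rng.standard_normal((3, 4)), np.zeros(4),   # G, b_gen
          np.zeros(3))                                      # b_top
for _ in range(200):
    params = wake_sleep_step(data, params, rng)
R, b_rec, G, b_gen, b_top = params
```

Each phase trains one set of weights using states produced by the other set, which is how the upward and downward passes are pushed toward agreement.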
Bottom-up unsupervised learning
Training starts from the bottom layer and proceeds layer by layer to the top. Unlabeled data (labeled data can also be used) is used to train the parameters of each layer in turn. This step can be regarded as an unsupervised training process; it is the part that differs most from traditional neural networks, and it can be regarded as a feature learning process. Specifically, the first layer is trained on unlabeled data, and its parameters are learned during training. This layer can be regarded as the hidden layer of a three-layer neural network that minimizes the difference between output and input. Because of limits on model capacity and sparsity constraints, the resulting model learns the structure of the data itself and thereby obtains features that are more expressive than the input. After layer n-1 has been learned, its output is used as the input of layer n, and layer n is trained; in this way the parameters of each layer are obtained in turn.
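The layer-by-layer procedure can be sketched with auto-encoders as the per-layer model. This is a minimal illustration under stated assumptions: the function names, random toy data, layer sizes, and hyperparameters are hypothetical, and the sparsity constraint mentioned above is omitted for brevity.

```python
import numpy as np


def train_autoencoder(X, n_hidden, rng, lr=0.05, steps=500):
    """Train a one-hidden-layer auto-encoder on X; return the encoder parameters."""
    n, d = X.shape
    W1 = 0.1 * rng.standard_normal((d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = 0.1 * rng.standard_normal((n_hidden, d))
    b2 = np.zeros(d)
    for _ in range(steps):
        H = np.tanh(X @ W1 + b1)
        d_out = (H @ W2 + b2 - X) / n              # reconstruction error gradient
        d_hid = (d_out @ W2.T) * (1.0 - H ** 2)    # back through tanh
        W2 -= lr * H.T @ d_out
        b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * X.T @ d_hid
        b1 -= lr * d_hid.sum(axis=0)
    return W1, b1


def greedy_pretrain(X, layer_sizes, rng):
    """Train layer n on the output of layer n-1, starting from the raw input."""
    params, layer_input = [], X
    for n_hidden in layer_sizes:
        W, b = train_autoencoder(layer_input, n_hidden, rng)
        params.append((W, b))
        layer_input = np.tanh(layer_input @ W + b)  # becomes the next layer's input
    return params, layer_input


rng = np.random.default_rng(0)
X = rng.random((30, 6))                             # unlabeled toy data
params, features = greedy_pretrain(X, [4, 2], rng)
print([W.shape for W, _ in params], features.shape)
```

No labels appear anywhere in this step; each layer only sees the representation produced by the layer below it.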
Top-down supervised learning
Training uses labeled data: the error is propagated from the top down, and the network is fine-tuned. The parameters of the whole multi-layer model are further optimized starting from the per-layer parameters obtained in the first step. This step is a supervised training process. The first step plays the role that random initialization plays in a conventional neural network. Because the initial values are not random but are obtained by learning the structure of the input data, they are closer to the global optimum and can lead to better results. The good performance of deep learning is therefore largely attributed to the feature learning of the first step.
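The fine-tuning step can be sketched as ordinary backpropagation that starts from existing layer parameters instead of fresh random ones. In the sketch below, `initial_layers` is a hypothetical stand-in for parameters produced by the unsupervised stage (here filled with random values only so the example is self-contained); the logistic output layer, toy labels, and hyperparameters are likewise assumptions.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def forward(X, layers, w_out, b_out):
    acts = [X]
    for W, b in layers:
        acts.append(np.tanh(acts[-1] @ W + b))
    return acts, sigmoid(acts[-1] @ w_out + b_out)


def fine_tune(X, y, layers, rng, lr=0.1, steps=300):
    """Backpropagate a labeled-data loss through every layer, top to bottom."""
    layers = [(W.copy(), b.copy()) for W, b in layers]
    w_out = 0.1 * rng.standard_normal(layers[-1][0].shape[1])
    b_out = 0.0
    n, losses = len(X), []
    for _ in range(steps):
        acts, p = forward(X, layers, w_out, b_out)
        losses.append(-np.mean(y * np.log(p + 1e-12)
                               + (1 - y) * np.log(1 - p + 1e-12)))
        d_out = (p - y) / n                                   # output-layer error
        delta = np.outer(d_out, w_out) * (1.0 - acts[-1] ** 2)
        w_out = w_out - lr * (acts[-1].T @ d_out)
        b_out = b_out - lr * d_out.sum()
        for i in reversed(range(len(layers))):                # error flows downward
            W, b = layers[i]
            grad_W, grad_b = acts[i].T @ delta, delta.sum(axis=0)
            if i > 0:
                delta = (delta @ W.T) * (1.0 - acts[i] ** 2)
            layers[i] = (W - lr * grad_W, b - lr * grad_b)
    return layers, w_out, b_out, losses


rng = np.random.default_rng(0)
X = rng.random((40, 4))
y = (X[:, 0] + X[:, 1] > X[:, 2] + X[:, 3]).astype(float)    # toy labels
# Stand-in for pre-trained parameters (assumption: random values for the demo).
initial_layers = [(0.5 * rng.standard_normal((4, 3)), np.zeros(3)),
                  (0.5 * rng.standard_normal((3, 2)), np.zeros(2))]
layers, w_out, b_out, losses = fine_tune(X, y, initial_layers, rng)
print(losses[0], losses[-1])
```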
Applications
Computer vision
The Multimedia Laboratory of the Chinese University of Hong Kong was the first Chinese team to apply deep learning to computer vision research. In the world-class face recognition competition LFW (Labeled Faces in the Wild), the laboratory beat Facebook to take first place, and the recognition ability of artificial intelligence in this field surpassed that of humans for the first time.
Speech recognition
In cooperation with Hinton, Microsoft researchers were the first to introduce RBMs and DBNs into the training of acoustic models for speech recognition. They achieved great success in large-vocabulary speech recognition systems, reducing the speech recognition error rate by a relative 30%. However, DNNs do not yet have effective fast parallel algorithms. Many research institutions are using large-scale data corpora and GPU platforms to improve the training efficiency of DNN acoustic models.
Internationally, companies such as IBM and Google have moved quickly into research on DNN-based speech recognition.
In China, companies and research units such as Alibaba, iFLYTEK, Baidu, and the Institute of Automation of the Chinese Academy of Sciences are also conducting research on deep learning for speech recognition.
Natural language processing and other fields
Many institutions are conducting research in this area. In 2013, Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean published the paper "Efficient Estimation of Word Representations in Vector Space", which established the word2vec model. Compared with the traditional bag-of-words model, word2vec can better express grammatical information. In natural language processing, deep learning is mainly applied to machine translation and semantic mining.