Deep learning

honggarae 15/12/2021 1031

Introduction

Deep learning is a general term for a class of pattern analysis methods. In terms of specific research content, it mainly involves three types of methods:

(1) Neural network systems based on convolution operations, namely convolutional neural networks (CNN).

(2) Self-encoding neural networks based on multiple layers of neurons, which include two types, autoencoders and sparse coding, and which have received widespread attention in recent years.

(3) Deep belief networks (DBN), which use multi-layer self-encoding neural networks for pre-training and then use label information to further optimize the network weights.

Through multi-layer processing, the initial "low-level" feature representation is gradually transformed into a "high-level" feature representation, after which a "simple model" can be used to complete complex classification and other learning tasks. Deep learning can therefore be understood as "feature learning" or "representation learning".

In the past, when machine learning was applied to real-world tasks, the features describing the samples were usually designed by human experts, which is called "feature engineering". The quality of the features has a crucial impact on generalization performance, and it is not easy for human experts to design good features. Feature learning (representation learning) generates good features through machine learning itself, which brings machine learning another step closer to "fully automated data analysis".

In recent years, researchers have gradually combined these types of methods. For example, convolutional neural networks, which were originally based on supervised learning, have been combined with autoencoder networks for unsupervised pre-training, and the network parameters are then fine-tuned with label information to form a convolutional deep belief network. Compared with traditional learning methods, deep learning methods have more model parameters, so model training is more difficult. According to the general rule of statistical learning, the more parameters a model has, the more training data it needs.

In the 1980s and 1990s, because of the limited computing power of computers and the limitations of related technologies, the amount of data available for analysis was too small, and deep learning did not show excellent recognition performance in pattern analysis. From 2006 onward, Hinton et al. applied the CD-k (contrastive divergence) algorithm to quickly estimate the weights and biases of restricted Boltzmann machine (RBM) networks. The RBM became a powerful tool for increasing the depth of neural networks, which led to the widespread use of the DBN (developed by Hinton and others, and used in speech recognition by companies such as Microsoft) and to the appearance of other deep networks. At the same time, sparse coding is also used in deep learning because it can automatically extract features from data. Convolutional neural network methods based on local data regions have also been studied extensively in recent years.

Interpretation

Deep learning is a type of machine learning, and machine learning is a necessary path toward realizing artificial intelligence. The concept of deep learning originates from research on artificial neural networks; a multi-layer perceptron with multiple hidden layers is a deep learning structure. Deep learning combines low-level features to form more abstract high-level representations (attribute categories or features), in order to discover distributed feature representations of data. The motivation for studying deep learning is to build neural networks that simulate the way the human brain analyzes and learns, mimicking the mechanisms the brain uses to interpret data such as images, sounds, and text.

The calculation involved in generating an output from an input can be represented by a flow graph: a graph that represents a computation, in which each node represents a basic calculation and its computed value, and the result of the calculation is passed on to the node's children. The set of calculations allowed at each node, together with the possible graph structures, defines a family of functions. Input nodes have no parents, and output nodes have no children.

A special attribute of such a flow graph is its depth: the length of the longest path from an input to an output.

A traditional feedforward neural network can be seen as having a depth equal to its number of layers (that is, the number of hidden layers plus 1 for the output layer). SVMs have a depth of 2 (one level for the kernel output or feature space, and another for the linear combination that produces the output).
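The definition of depth can be illustrated with a minimal sketch. The graph encoding below (a dictionary from each node to its children) and the node names are assumptions made for this example only; depth is counted in edges along the longest input-to-output path.

```python
from functools import lru_cache

def graph_depth(children, inputs):
    """Length (in edges) of the longest path starting at any input node.

    `children` maps each node to the list of nodes that consume its value.
    The graph is assumed to be acyclic.
    """
    @lru_cache(maxsize=None)
    def longest_from(node):
        kids = children.get(node, [])
        if not kids:  # an output node has no children
            return 0
        return 1 + max(longest_from(k) for k in kids)

    return max(longest_from(n) for n in inputs)

# Hypothetical feedforward net: two hidden layers plus one output layer.
mlp = {"x": ["h1"], "h1": ["h2"], "h2": ["y"], "y": []}
# Hypothetical kernel machine: input -> kernel features -> linear combination.
svm = {"x": ["k"], "k": ["out"], "out": []}

print(graph_depth(mlp, ["x"]))  # 3: two hidden layers + 1 output layer
print(graph_depth(svm, ["x"]))  # 2
```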

One direction of artificial intelligence research is represented by the so-called "expert system", which is defined by a large number of "If-Then" rules and reflects top-down thinking. The artificial neural network represents another, bottom-up way of thinking. There is no strict formal definition of a neural network. Its basic feature is that it tries to imitate the way information is transmitted and processed between neurons in the brain.

Features

Deep learning differs from traditional shallow learning in the following ways:

(1) It emphasizes the depth of the model structure, which usually has 5, 6, or even 10 layers of hidden nodes.

(2) It makes the importance of feature learning explicit. In other words, through layer-by-layer feature transformation, the feature representation of a sample in the original space is transformed into a new feature space, which makes classification or prediction easier. Compared with constructing features by hand-written rules, using big data to learn features is better able to capture the rich internal information of the data.

By establishing a suitable number of neuron computing nodes and a multi-layer computing hierarchy, selecting appropriate input and output layers, and learning and tuning the network, a functional relationship from input to output is established. Although the true functional relationship between input and output cannot be recovered exactly, the network can approximate it as closely as possible. With a successfully trained network model, complex processing tasks can be automated.

Typical deep learning models

Typical deep learning models include the convolutional neural network, the DBN, and the stacked autoencoder network. These models are described below.

Convolutional neural network model

Before the emergence of unsupervised pre-training, training deep neural networks was usually very difficult; convolutional neural networks were one notable special case. Convolutional neural networks are inspired by the structure of the visual system. The first convolutional neural network computational model was proposed in Fukushima's neocognitron. Based on local connections between neurons and a hierarchically organized transformation of the image, neurons with the same parameters are applied at different positions of the previous layer, which yields a translation-invariant network structure. Later, building on this idea, LeCun et al. used error gradients to design and train convolutional neural networks and obtained superior performance on some pattern recognition tasks. To date, pattern recognition systems based on convolutional neural networks are among the best-performing systems, especially for handwritten character recognition.
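The effect of applying the same parameters at different positions can be seen in a small one-dimensional sketch. The kernel values and the input signal are made up for illustration, and NumPy is an assumed dependency. Shifting the input shifts the feature map by the same amount, and a global maximum over positions is then unchanged by the shift.

```python
import numpy as np

def conv1d_valid(signal, kernel):
    """Slide one shared kernel over the signal (cross-correlation, no padding)."""
    n, k = len(signal), len(kernel)
    return np.array([np.dot(signal[i:i + k], kernel) for i in range(n - k + 1)])

kernel = np.array([1.0, -1.0])  # hypothetical edge-like filter
x = np.array([0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0])
x_shifted = np.roll(x, 1)       # same pattern, moved one step to the right

y = conv1d_valid(x, kernel)
y_shifted = conv1d_valid(x_shifted, kernel)

# The feature map moves with the input (shared weights at every position).
print(np.allclose(y_shifted[1:], y[:-1]))  # True
# Pooling over positions removes the dependence on the shift.
print(y.max() == y_shifted.max())          # True
```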

Deep belief network model

A DBN can be interpreted as a Bayesian probabilistic generative model composed of multiple layers of stochastic latent variables. The top two layers have undirected symmetric connections, and the lower layers receive top-down directed connections from the layer above. The state of the bottom units is the visible input data vector. A DBN is composed of a stack of several structural units, and the structural unit is usually an RBM (restricted Boltzmann machine). The number of visible-layer neurons of each RBM unit in the stack equals the number of hidden-layer neurons of the previous RBM unit. According to the deep learning mechanism, the input examples are used to train the first RBM unit, its output is used to train the second RBM, and further RBMs are stacked in the same way to improve model performance. In the unsupervised pre-training process, after the DBN has encoded the input up to the top RBM, the state of the top layer is decoded back down to the bottom units to reconstruct the input. As the structural unit of the DBN, each RBM shares its parameters with the corresponding layer of the DBN.
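The stacking procedure can be sketched roughly as follows, using one step of contrastive divergence (CD-1) per update. This is an illustrative toy, not a tuned implementation: the class name, layer sizes, learning rate, step count, and toy data are all assumptions, and NumPy is an assumed dependency. Note that the second RBM has as many visible units as the first has hidden units.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class RBM:
    """Binary restricted Boltzmann machine (illustrative, not tuned)."""

    def __init__(self, n_visible, n_hidden):
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)
        self.b_h = np.zeros(n_hidden)

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_step(self, v0, lr=0.1):
        """One CD-1 update on a batch; returns the reconstruction error."""
        ph0 = self.hidden_probs(v0)
        h0 = (rng.random(ph0.shape) < ph0).astype(float)  # sample hidden states
        pv1 = self.visible_probs(h0)                      # reconstruct the input
        ph1 = self.hidden_probs(pv1)
        n = v0.shape[0]
        self.W += lr * (v0.T @ ph0 - pv1.T @ ph1) / n
        self.b_v += lr * (v0 - pv1).mean(axis=0)
        self.b_h += lr * (ph0 - ph1).mean(axis=0)
        return float(np.mean((v0 - pv1) ** 2))

def train(rbm, data, steps=200):
    err = None
    for _ in range(steps):
        err = rbm.cd1_step(data)
    return err

# Toy binary data drawn from two made-up prototype patterns.
prototypes = np.array([[1.0] * 6 + [0.0] * 6, [0.0] * 6 + [1.0] * 6])
data = prototypes[rng.integers(0, 2, size=64)]

rbm1 = RBM(12, 6)                    # first RBM sees the raw input
err1 = train(rbm1, data)
features = rbm1.hidden_probs(data)   # its output feeds the next RBM
rbm2 = RBM(6, 3)                     # visible size = hidden size of rbm1
err2 = train(rbm2, features)
print(err1, err2)
```

Feeding hidden probabilities rather than sampled states to the next RBM is a common simplification and is used here only to keep the sketch short.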

Stacked autoencoder network model

The structure of the stacked autoencoder network is similar to that of the DBN. It consists of a stack of several structural units; the difference is that the structural unit is an autoencoder instead of an RBM. The autoencoder is a two-layer neural network: the first layer is called the encoding layer, and the second layer is called the decoding layer.
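A single structural unit of this kind can be sketched as a small NumPy autoencoder trained by gradient descent on the squared reconstruction error. The function name, sigmoid activations, layer sizes, learning rate, and toy data are assumptions made for this example. In a stacked network, the codes produced by one trained unit would serve as the input of the next.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(x, n_hidden, steps=1000, lr=0.5):
    """Train an encoding layer and a decoding layer to reconstruct x."""
    n, d = x.shape
    W_enc = 0.1 * rng.standard_normal((d, n_hidden))
    b_enc = np.zeros(n_hidden)
    W_dec = 0.1 * rng.standard_normal((n_hidden, d))
    b_dec = np.zeros(d)
    losses = []
    for _ in range(steps):
        h = sigmoid(x @ W_enc + b_enc)       # encoding layer
        x_hat = sigmoid(h @ W_dec + b_dec)   # decoding layer
        err = x_hat - x
        losses.append(0.5 * np.sum(err ** 2) / n)
        # Backpropagate the squared reconstruction error.
        d_out = err * x_hat * (1.0 - x_hat) / n
        d_hid = (d_out @ W_dec.T) * h * (1.0 - h)
        W_dec -= lr * (h.T @ d_out)
        b_dec -= lr * d_out.sum(axis=0)
        W_enc -= lr * (x.T @ d_hid)
        b_enc -= lr * d_hid.sum(axis=0)

    def encode(v):
        return sigmoid(v @ W_enc + b_enc)

    return encode, losses

# Toy data drawn from two made-up prototype patterns.
prototypes = np.array([[1.0] * 4 + [0.0] * 4, [0.0] * 4 + [1.0] * 4])
x = prototypes[rng.integers(0, 2, size=64)]

encode, losses = train_autoencoder(x, n_hidden=3)
code = encode(x)   # learned features, usable as input to a further unit
print(losses[0], losses[-1])
```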

Deep learning training process

In 2006, Hinton proposed an effective method for building multi-layer neural networks on unlabeled data. It is divided into two steps: first, the network is built layer by layer so that only a single layer is trained at a time; then, after all layers have been trained, the wake-sleep algorithm is used for tuning.

The weights between all layers except the top layer are made bidirectional, so that the top layer remains a single-layer neural network while the other layers become a graphical model. The upward weights are used for "cognition" and the downward weights for "generation". The wake-sleep algorithm is then used to adjust all the weights so that cognition and generation agree, that is, so that the top-level representation produced by cognition can reconstruct the bottom-level nodes as accurately as possible. For example, if a node in the top layer represents a human face, then images of all faces should activate this node, and the image generated downward from it should look like a rough face. The wake-sleep algorithm is divided into two parts, wake and sleep.

Wake stage: the cognitive process. External features and the upward weights produce an abstract representation at each layer, and gradient descent is used to modify the downward weights between layers.

Sleep stage: the generative process. The top-level representation and the downward weights generate the states of the lower layers, and the upward weights between layers are modified.
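The two stages can be sketched for the simplest case of one visible and one hidden layer with binary units. This is a minimal illustration under stated assumptions, not the procedure of any particular paper: the parameter names, sizes, learning rate, and toy data are all made up for the example, and NumPy is an assumed dependency. The wake step updates only the downward (generative) parameters, and the sleep step updates only the upward (recognition) parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sample(p):
    return (rng.random(p.shape) < p).astype(float)

n_v, n_h, lr = 8, 3, 0.05
p = {
    "R": np.zeros((n_v, n_h)),  # upward ("cognition") weights
    "r_h": np.zeros(n_h),       # upward bias of the hidden layer
    "G": np.zeros((n_h, n_v)),  # downward ("generation") weights
    "g_v": np.zeros(n_v),       # downward bias of the visible layer
    "g_top": np.zeros(n_h),     # generative bias of the top layer
}

def wake_step(v):
    """Recognize a real input, then improve the generative parameters."""
    h = sample(sigmoid(v @ p["R"] + p["r_h"]))
    v_pred = sigmoid(h @ p["G"] + p["g_v"])
    p["G"] += lr * np.outer(h, v - v_pred)
    p["g_v"] += lr * (v - v_pred)
    p["g_top"] += lr * (h - sigmoid(p["g_top"]))

def sleep_step():
    """Generate a sample top-down, then improve the recognition parameters."""
    h = sample(sigmoid(p["g_top"]))
    v = sample(sigmoid(h @ p["G"] + p["g_v"]))
    h_pred = sigmoid(v @ p["R"] + p["r_h"])
    p["R"] += lr * np.outer(v, h - h_pred)
    p["r_h"] += lr * (h - h_pred)

# Toy inputs drawn from two made-up prototype patterns.
prototypes = np.array([[1.0] * 4 + [0.0] * 4, [0.0] * 4 + [1.0] * 4])
for _ in range(500):
    wake_step(prototypes[rng.integers(0, 2)])
    sleep_step()
```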

Bottom-up unsupervised learning

Training starts from the bottom layer and proceeds layer by layer to the top. The parameters of each layer are trained in turn using unlabeled data (labeled data can also be used). This step can be regarded as an unsupervised training process; it is the part that differs most from traditional neural networks, and it can be regarded as a feature learning process. Specifically, the first layer is trained on unlabeled data, and its parameters are learned during this training. The layer can be regarded as the hidden layer of a three-layer neural network that minimizes the difference between its output and its input. Because of limits on model capacity and sparsity constraints, the resulting model learns the structure of the data itself and thus obtains features that are more expressive than the raw input. After layer n-1 has been learned, its output is used as the input of layer n, and layer n is trained; in this way the parameters of each layer are obtained in turn.

Top-down supervised learning

Training uses labeled data: the error is propagated from the top down, and the network is fine-tuned. The parameters of the whole multi-layer model are further optimized and adjusted, starting from the per-layer parameters obtained in the first step. This step is a supervised training process. The first step plays a role similar to the random initialization of a conventional neural network. However, because the initial values are obtained by learning the structure of the input data rather than chosen at random, they are closer to the global optimum, which leads to better results. The good performance of deep learning is therefore largely attributed to the feature learning carried out in the first step.

Applications

Computer vision

The Multimedia Laboratory of the Chinese University of Hong Kong was the first Chinese team to apply deep learning to computer vision research. On the LFW (Labeled Faces in the Wild) face recognition benchmark, the laboratory outperformed Facebook to take first place, and recognition accuracy on this benchmark surpassed human performance for the first time.

Speech recognition

In cooperation with Hinton, Microsoft researchers were the first to introduce RBMs and DBNs into the training of acoustic models for speech recognition. They achieved great success in large-vocabulary speech recognition systems, with a relative reduction of 30% in the speech recognition error rate. However, DNNs do not yet have effective fast parallel training algorithms. Many research institutions use large-scale data corpora and GPU platforms to improve the training efficiency of DNN acoustic models.

Internationally, companies such as IBM and Google have moved quickly on DNN speech recognition research.

In China, companies and research units such as Alibaba, iFLYTEK, Baidu, and the Institute of Automation of the Chinese Academy of Sciences are also conducting research on deep learning for speech recognition.

Natural language processing and other fields

Many institutions are conducting research in this area. In 2013, Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean published the paper "Efficient Estimation of Word Representations in Vector Space", which established the word2vec model. Compared with the traditional bag-of-words model, word2vec is better able to express grammatical information. In natural language processing, deep learning is mainly applied to machine translation and semantic mining.
