Upload
others
View
33
Download
0
Embed Size (px)
Citation preview
Converting Handwritten Mathematical Expressions into LaTeX Norah Borus (nborus), William Bakst (wbakst), Amit Schechter (amitsch)
Motivation
Datasets
• Typing up mathema.cal equa.ons has becomequite a necessity when submi6ng academicpapers, wri.ng and solving problem sets,presen.ng mathema.cal concepts, and muchmore.
• We’re looking to automate the process ofconver.ng handwri@en expressions to LaTeXusingmodernMachineLearningtechniques.
Features
• Our dataset includes the traces of 10,000handwri@en expressions mapped to thecorresponding ground truth LaTeX expression(obtainedfromKaggle).
• We split the data into 80% train, 10% dev, and10% test. We use traces for CSeg, normalizedpixelarrays forOCR,and traceswithsegmentedcharactersforSA.
• Example:
$(x-x^3)^2+(y-y^3)^2=r^2$
Discussion
Experimental Results Methods
Future Work • CSeg:implementacombinedmodelusingNNbeamsearchandSVMpredic.ons
toincreaseaccuracy.UseAdaBoostandmul.-scaleshapecontextfeatures.• OCR: Experiment with advanced OCR techniques such as Random Forests to
help improve the accuracy of our CNN and SVM. Use data augmenta.ontechniquestoaddexamplesofuncommonsymbols.
• SA:Weplanonfindingor implemen.nga LaTeXparser thatwill enableus totrain an SVM to aid our current model in correctly building square root,exponents,andothermorecomplexequa.ons.
• E2E: implement a CNN for an end to end approach (similar to Google’sTesseract).
• Segmenta.on:featurescomefromdata.SVMusesfeatures including overlap, inverse horizontal,ver.cal, and centroids distance, and horizontalcontainment, which are the main indicators forgroupingtraces.
• Character Recogni.on: SVM and NN use databased features include normalized fla@enedgreyscalepixelarray.
WederivedHistogramofOrientedGradients(HOG)fromthepixelarray–improvesimagerecogni.onbyusingoverlappinglocalcontrastnormaliza.on.
CharacterSegmenta.on(CSeg)
CharacterRecogni.on(OCR)
StructuralAnalysis(SA)
CSegAccuracy:OCRAccuracy:
• Most character recogni.on errors come from symbols that do notappearasfrequentlyindataset(e.g.\mu).
• Segmenta.on: NN beam search and the binary SVM achieve similaraccuracy(64%).Beamsearchtendstogrouptracesthatshouldremainseparated.BinarySVMtendstokeepstrokesseparated.
• We are working on improving our Structural Analysis. We arestruggling with how to accurately test the output and train modelsandarecurrentlyrepor.ngaccuracybyhandtopreservecorrectness.
A
Input:normalizedpixelarray.Output:character’sLaTexrepresenta.on.• SVM:usesmul.classSVMwithalinearKernel,L2regulariza.on.Mul.classSVMloss:
• NN(mul;class):usessameinternalstructureasNNinCSeg.Outputistheclassifica.onoftheLaTexsymbol.
• Crossentropyloss:• CNN:mul.classCNNwith5hiddenlayers,ReLUac.va.onfunc.on,and
crossentropyloss.
Input:listofalltracesforequa.on.Output:tracesgroupedbycharacters.
• Overlap:mergesoverlappingstrokes.Allotherstrokesareseparate.• SVM:usesabinarySVMwithalinearKernel,L2regulariza.on,andC
value50.BinarySVMlossfunc.on:• NN(binaryclassifica;on):withsinglehiddenlayer,sohmaxandsigmoid
ac.va.on,andcrossentropyloss.Inputisa32X32fla@enedpixelarrayofasingleormul.plestrokes.OutputiswhetherornottheimageisavalidLaTexsymbol.
• BeamSearch:weusebeamsearchtoexplorepossiblesegmenta.ons.Wemaximizethesegmenta.onscore,whichisbasedonNNpredic.on(confidenceofitbeingavalidcharacter).
NextΣ ϕ ( x )Next Next Next
Sup
Sub
$\sum_{i=1}^{m}\phi(x)$
m
i=1
Baseline Heuris;cs
Test 14% 65%
SAAnalysisAccuracy:
Test Train
CNN 44% NA
Overlap 53% 53%
SVM 64% 96%
BinaryNN 78% 97%
Beamsearch 64% NA
References:• M.Thoma–“[email protected]”,2015,Bachelor’sTthesis,KIT.• ScikitLearn.h@p://scikit-learn.org/stable/modules/generated/sklearn.svm.LinearSVC.html.• Pytorch.h@p://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html.
• Structureanalysisisdoneaccordingtorela.veloca.onandsize.
• Weusefeaturessuchasoverlap,squaredlossfromsuperscript/subscriptboundingboxtodeterminetherela.onshipbetweentwocharacters.
• Wethenusetheserela.onshipstorebuildthegroundtruthLaTeX.
0%20%40%60%80%100%
CNN SVM NN
Train
Test