regex - Use the shell command tesseract in a Perl script to print text output
Hi, I have a script I want to write. First it takes an image from the HTML, and then I want to use tesseract to get the text output from it. I can't figure out how to do it.
Here is the code:
    #!/usr/bin/perl -x

    ##########
    $user = ''; # enter username here
    $pass = ''; # enter password here
    ###########

    # server settings (no need modify)
    $home = "http://37.48.90.31";
    $url = "$home/c/test.cgi?u=$user&p=$pass";

    # html code
    $html = `get "$url"`;

    #### add code here:
    # grab img html code
    if ($html =~ /\img[^>]* src=\"([^\"]*)\"[^>]*/) {
        $takeimg = $1;
    }
    @dirs = split m!/!, $takeimg;
    $img = $dirs[2];
    #########

    die "<img> not found\n" if (!$img);

    # download img server (save as: ocr_me.img)
    print "get '$img' > ocr_me.img\n";
    system "get '$img' > ocr_me.img";

    #### add code here:
    # run ocr (using shell command tesseract) on img , save text ocr_result.txt
    system ("tesseract", "tesseract ocr_me.img ocr_result");
    ###########

    die "ocr_result.txt not found\n" if (!-e "ocr_result.txt");

    # check ocr results:
    $txt = `cat ocr_result.txt`;
Did I take the image from the HTML correctly, or do I need a different regex? And how do I display 'ocr_result.txt'?
Thanks for the help!
    #!/usr/bin/perl -x
    use LWP::Simple;

    ##########
    $user = ''; # enter username here
    $pass = ''; # enter password here
    ###########

    # server settings (no need to modify)
    $home = "http://37.48.90.31";
    $url = "$home/c/test.cgi?u=$user&p=$pass";

    # fetch the HTML
    $html = get($url);

    #### add code here:
    # grab the img src from the HTML
    if ($html =~ /<img[^>]* src=\"([^\"]*)\"[^>]*/) {
        $takeimg = $1;
        @dirs = split('/', $takeimg);
        $img = $dirs[2] or die "<img> not found\n";

        # download the img from the server (save as: ocr_me.img)
        getstore($img, 'ocr_me.img');

        #### add code here:
        # run OCR (using shell command tesseract) on the img; tesseract writes ocr_result.txt
        # each argument is passed as its own list element, so no shell quoting is involved
        system("tesseract", "ocr_me.img", "ocr_result");
        ###########

        die "ocr_result.txt not found\n" if (!-e "ocr_result.txt");

        # check OCR results: print the file line by line
        open(my $fh, '<', 'ocr_result.txt') or die "cannot open ocr_result.txt: $!\n";
        print for (<$fh>);
        close($fh);
    } else {
        print "image not found\n";
    }
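The two steps that tend to go wrong here, pulling the src out of the img tag and handing arguments to tesseract, can also be sketched as separate helpers. This is only a rough sketch, not the poster's exact code: the names first_img_src and ocr_image and the sample HTML are made up for illustration, and it assumes tesseract is on the PATH and writes its result to the output base name plus ".txt".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the src attribute of the first <img> tag, or undef if there is none.
# (Hypothetical helper; a real HTML parser is more robust than a regex.)
sub first_img_src {
    my ($html) = @_;
    return $html =~ /<img\b[^>]*?\bsrc\s*=\s*"([^"]*)"/i ? $1 : undef;
}

# Run tesseract on $image with output base $base and return the text
# it wrote to "$base.txt". (Hypothetical helper.)
sub ocr_image {
    my ($image, $base) = @_;

    # List form of system: the program name comes first, then one list
    # element per argument. No shell is involved.
    system('tesseract', $image, $base) == 0
        or die "tesseract failed (status $?)\n";

    open(my $fh, '<', "$base.txt") or die "cannot open $base.txt: $!\n";
    local $/;              # slurp mode: read the whole file at once
    my $text = <$fh>;
    close($fh);
    return $text;
}

# Made-up sample HTML, only to show the regex at work.
my $sample = '<p><img alt="captcha" src="/c/img/123.png"></p>';
print first_img_src($sample), "\n";    # prints /c/img/123.png

# With tesseract installed and ocr_me.img already downloaded:
# print ocr_image('ocr_me.img', 'ocr_result');
```

The point of the list form is that "tesseract ocr_me.img ocr_result" passed as one string after the program name reaches tesseract as a single odd argument, whereas separate list elements arrive as separate arguments.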