This is the static archive of the Massassi Forums. The forums are closed indefinitely. Thanks for all the memories!

Optical Character Flabbergastion
2009-04-22, 12:58 AM #1
An experiment in bridging my "legacy peripheral" to my computer gone horribly awry...

[http://www.interrobling.com/images/fromtypewriter.jpg]
Analog...

[http://www.interrobling.com/images/toscanner.jpg]
...to Digital.

[http://www.interrobling.com/images/omgocr.jpg]
OMG OCR! Text searchable! Time to copy...

[http://www.interrobling.com/images/uhthanksspellcheck.jpg]
...and paste... uhh... and thanks for the info, SpellCheck.
Cordially,
Lord Tiberius Grismath
1473 for '1337' posts.
2009-04-22, 1:00 AM #2
:XD:
woot!
2009-04-22, 2:36 AM #3
Now turn on speech recognition and read the OCR output out loud.

Edit: and then translate it to Korean and back
Stuff
2009-04-22, 4:37 AM #4
want log03
2009-04-22, 8:09 AM #5
Depending on the font used, OCR might not be as good.

Try increasing the contrast of the scanned image; some of the bits look faint, and that would cause problems. Adjust it until you get nice dark lines for the text.
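
For anyone who wants to see what "increase the contrast" means in numbers, here is a rough sketch in plain Python. It assumes the scan is already loaded as rows of 8-bit grayscale values (0 = black, 255 = white). The names `stretch_contrast`, `binarize`, and `faint`, and the percentile and threshold defaults, are made up for illustration and are not from any particular scanner or OCR package.

```python
def stretch_contrast(pixels, low_pct=5, high_pct=95):
    """Linearly rescale grayscale values so that the chosen low/high
    percentiles land on 0 and 255 (values outside are clamped)."""
    flat = sorted(v for row in pixels for v in row)
    lo = flat[(len(flat) - 1) * low_pct // 100]
    hi = flat[(len(flat) - 1) * high_pct // 100]
    if hi == lo:
        # Flat image: nothing to stretch, return an unchanged copy.
        return [row[:] for row in pixels]
    scale = 255 / (hi - lo)
    return [
        [max(0, min(255, round((v - lo) * scale))) for v in row]
        for row in pixels
    ]


def binarize(pixels, threshold=128):
    """Map every pixel to pure black (0) or pure white (255)."""
    return [[0 if v < threshold else 255 for v in row] for row in pixels]


# Hypothetical 3x4 patch of a faint scan: the "ink" (140-155) is only
# slightly darker than the paper (197-202).
faint = [
    [200, 198, 150, 201],
    [199, 140, 145, 200],
    [202, 197, 155, 199],
]

before = binarize(faint)                   # faint ink is lost at threshold 128
after = binarize(stretch_contrast(faint))  # ink survives after stretching
```

On this toy patch the faint stroke disappears if you threshold the raw values, but comes out as solid black once the values are stretched first, which is the "nice dark lines" effect described above. A real image editor or OCR front end would do something more careful than this.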

2009-04-22, 8:19 AM #6
it got kitchen right
2009-04-22, 10:49 AM #7
MZZT: http://sticklertron.com/?p=6 I came to some of the same conclusions.

kyle90: How on earth could you possibly read that output? Recorded submissions/attempts are welcome.
Cordially,
Lord Tiberius Grismath
1473 for '1337' posts.
2009-04-22, 11:19 AM #8
I tried the first paragraph.

http://kyle90.info/files/OCR.mp3
Stuff
2009-04-22, 11:39 AM #9
"I bone a cornmen" hahahaha
Cordially,
Lord Tiberius Grismath
1473 for '1337' posts.
2009-04-22, 11:45 AM #10
dang, when I read the mouseover and saw that the experiment had gone "horribly awry" I expected legacy peripheral-computer-mutant-creature superhero.
If you choose not to decide, you still have made a choice.

Lassev: I guess there was something captivating in savagery, because I liked it.
2009-04-22, 11:56 AM #11
Originally posted by Sarn_Cadrill:
dang, when I read the mouseover and saw that the experiment had gone "horribly awry" I expected legacy peripheral-computer-mutant-creature superhero.


2009-04-22, 12:38 PM #12
So that's why they don't make battery-operated printers. hahaha awesome video.
Cordially,
Lord Tiberius Grismath
1473 for '1337' posts.
2009-04-23, 5:17 AM #13
It looks like you tried cut/pasting the text out of a PDF -- not the best way to OCR stuff. Try a real OCR package.
And when the moment is right, I'm gonna fly a kite.
2009-04-23, 6:52 AM #14
I'll have to. Interestingly, Acrobat seemed to have a better recognition rate after successive scans.
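
One way to put a number on "recognition rate" across scans is the character error rate: the edit distance between the OCR output and a hand-typed reference, divided by the length of the reference. The sketch below is plain Python and only an illustration; the function names are hypothetical, and it assumes you have a trusted transcript to compare against.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ocr_text, reference):
    """Edit distance normalized by reference length (0.0 = perfect)."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(ocr_text, reference) / len(reference)


# Illustrative comparison of one garbled word against its intended spelling.
cer = character_error_rate("kitch~!l", "kitchen")
```

Running this over each successive scan against the same reference would show whether later scans really are improving, with lower values meaning fewer character mistakes.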

Originally posted by Third Scan:
1 1, E: 11 )U-" 0: thp mo_"nlll{: <
_ l.t ._ I'm htl jgG~d. It'.•_: '1 ....., I v CiCcom_)li..,h, d little t L y.
;•P'l.t not ~ntii.'i?ly t JU ~; I h;,.v,? t i nally d8c ideJ. u110n my apu'tm.:nt in
'a hington ,t"l.t-•. I ve chQ sn I'he ,ucU'oy '-~.t Jj-lltOWll., a luxuL"'Y lpru:tment
carr r:l"'x bet [tlen :3plltoHn and -tu~on -'.nne cOIfu'llanding an impre..,~iv•:;: 'lieN
01 :lliott iay (anel, I ho~,~, 3.. commen~U.L'at6 allocCition of Jr"attl t'-' ",
.;ca_'c o:: •unlight). Th" .:..p~ci.l.:ic unit I have in mind boact_ a lir eplace , an in-unit \.ra.:.her-dryer , and a dislll'J"a,.3cler. hile it•.:; echnica.lly a tudio, :::, ~ 'O:epinc; ar,,:Ci is 'l.pportioned o_'f L:om th? ma:"l1 living -pac,;: ,
Community amcnit it2.• Ultimately drove my G..eci ion , 1.y iin3.1 candid1.t ~
J:1?u b_,:;:n tho ~"i!~nd:!:, in .~p"l".rtmC!nt", in th2 Univp•L'"ity .-Ii trict I ths "Jo"y
lily " 0 Canitol Hill, Tower eOl dm•mtown, and The .'i.udrey. 'londrian
1/".,.... -particularly a.r'for<.1.Plble at ',6Q 5/mo ; however, r"':;,ort.. 0 ...' c.dm"' 1.l1.(..
V"< rant l<>Lt mp ~{.oit di .co~aged. I al 0 had thp crseJin~ ::.u picion
that liVing In a. tat,., Coll"'E""'-.llk~ to'•1'1 could Of 'ur: 'Jut could :'.l 0 b
31i'::" '~ing, :1 I ..:;,..-L•t ~ ftXtlpI' y.~::' -~..:'0Ir. colI ~'2 • .l...' J,...""r
1, ~t+-'r '.:n'l= "lrt..ll'nt. =t .,. ait.'l'i.-':.. ci ic1 -: C '1 ts r"-•ij:.n.ti<: n:>iEh ,rtwcn n.' t"j"'" .::oom I •ro1.1' 1 l-}8V'-' r~ntr>rl 11:'.1 -:':~:~3.t ri"lto ',. . th ••... 0 mic y_"'in-:r=:. •' r 1_ tL1it I toul,•n -lIed. o'l tin~tl'-0_ c~t tholwh, and ~~.cT:..":' .1"l:/ 0-' th-con.~!l.i nc~ 0 ro.ol2-_~.1 l ivin: tl..,t Th'" ull .'-"y, 1.t only 1"..e OT 0 a month mor-, had, To .l:r-ClOl ,;:l.. ric", hut 'I'h '•wL,.., 11'"' .. a y~~r-:'ou.n~ in. ooL•, h ....t-c1 -=oc~ PI...'11 b•.... t-r vi 1~_ .
-~o, th•., •udr<>y h::i::> l~tm :::y in-unit and ~. larp""-" kitch~!l.
::;.
Cordially,
Lord Tiberius Grismath
1473 for '1337' posts.
2009-04-23, 8:49 AM #15
it also looks as though your scanned document is not level, which will throw off a lot of OCR software too. the software we use here at work scans the document right into a text editor, not a PDF. that could be part of the problem too.
TAKES HINTS JUST FINE, STILL DOESN'T CARE
2009-04-23, 8:52 AM #16
What software?
Cordially,
Lord Tiberius Grismath
1473 for '1337' posts.
2009-04-23, 9:03 AM #17
I tried using FreeOCR, which uses Tesseract...

Quote:
The Tesseract free OCR engine is an open source product released by Google. It was developed at Hewlett Packard Laboratories between 1985 and 1995. In 1995 it was one of the top 3 performers at the OCR accuracy contest organized by University of Nevada in Las Vegas. The Tesseract engine source code is now maintained by Google and the project can be found here: http://code.google.com/p/tesseract-ocr/


[http://sticklertron.com/images/ocr_fail.jpg]

No dice.

THE CHALLENGE STANDS:

[http://sticklertron.com/images/challenge.jpg]

(...and for extra credit:)
[http://sticklertron.com/images/challenge_bonus.jpg]
Cordially,
Lord Tiberius Grismath
1473 for '1337' posts.
2009-04-23, 9:36 AM #18
Fed it through my "Human Brain 1.00" OCR software. I think it did a pretty good job.

Quote:
April 22nd
The wee hours of the morning.

So it seems I'm jetlagged. It's 2:21 A.M. I've accomplished little today. That's not entirely true; I have finally decided upon my apartment in Washington state. I've chosen The Audrey at Belltown, a luxury apartment complex between Belltown and Queen Anne commanding an impressive view of Elliott Bay (and, I hope, a commensurate allocation of Seattle's scarce sunlight). The specific unit I have in mind boasts a fireplace, an in-unit washer-dryer, and a dishwasher. While it's technically a studio, a sleeping area is apportioned off from the main living space.

Community amenities ultimately drove my decision. My final candidates had been the Mondrian Apartments in the University District, the "Joey Ray" on Capitol Hill, Tower 801 downtown, and The Audrey. Mondrian was particularly affordable at $695/mo; however, reports of crime and vagrants left me a bit discouraged. I also had the creeping suspicion that living in a State College-like town could be fun but could also be alienating, as I drifted further and further from college. The Joey Ray was a pretty good apartment. It was situated in a charming residential neighborhood and the room I would have rented had a great hilltop view of the "cosmic syringe." The unit I toured smelled distinctly of cat though, and lacked many of the conveniences of modern living that The Audrey, at only $128 or so a month more, had. Tower 801 was nice, but The Audrey has a year-round indoor, heated pool and better views.

Also, the Audrey has laundry in-unit and a larger kitchen.


The software would also like to remark that this was the most ****ing boring thing it has ever had to read or write.
Stuff
2009-04-23, 9:49 AM #19
hmm, OCR software having trouble with a low-resolution low-contrast scan that has compression artifacts