Other Scanning Old Magazines - a Worklog

Discussion in 'Photography, Art & Design' started by Gareth Halfacree, 3 Mar 2025.

  1. Gareth Halfacree

    Gareth Halfacree WIIGII! Lover of bit-tech Administrator Super Moderator Moderator

    Joined:
    4 Dec 2007
    Posts:
    17,906
    Likes Received:
    7,872
    Yeah, I've got... a lot of PDFs.

    Code:
    2.8G    ./2600 Digest
    1.7G    ./6502 Micro Journal
    1.4G    ./68 Micro Journal
    5.4G    ./80 Micro
    7.7G    ./80 Microcomputing
    2.6G    ./80 US Magazine
    534M    ./8000 Plus/text
    2.3G    ./8000 Plus
    989M    ./AC's Guide to the Commodore Amiga
    285M    ./AC's Tech for the Commodore Amiga
    4.4G    ./ACE Magazine
    163M    ./Acorn Programs
    19G    ./Acorn User
    2.7G    ./Ahoy
    5.9G    ./Amazing Computing
    6.8G    ./Amiga Computing
    1.2G    ./Amiga Force
    1.4G    ./Amiga Format/text
    15G    ./Amiga Format
    296M    ./Amiga Informer
    175M    ./Amiga Shopper/text
    6.1G    ./Amiga Shopper
    6.8G    ./Amiga User International
    6.1G    ./Amiga World
    8.0G    ./Analog Computing
    151M    ./Analog Science Fiction and Fact
    247M    ./Antic's Amiga Plus
    1.2G    ./Apple 2000
    1.5G    ./Argos
    32M    ./Atari Computing UK
    306M    ./Atari Connection
    34M    ./Atari Corporate Magazines
    402M    ./Atari Explorer
    1.9G    ./Atari ST User
    95M    ./Atari User UK
    898M    ./BBS Magazine
    9.0G    ./BYTE
    184M    ./Beer Magazine
    78M    ./Big K
    43M    ./Bite Size Zines
    141M    ./Blip
    748M    ./CDi Magazine
    132M    ./COMPUTE!'s PC Magazine
    51M    ./CPC Attack
    9.5G    ./CUAmiga
    4.7G    ./CVG
    35M    ./Catalogues
    3.2G    ./Color Computer Magazine
    603M    ./Commander Magazine
    281M    ./Commodore Diehard
    1.2G    ./Commodore Disk User
    1.2G    ./Commodore Format
    718M    ./Commodore Horizons
    2.3G    ./Commodore Magazine
    1.8G    ./Commodore Microcomputers
    1.1G    ./Commodore Power Play
    14G    ./Commodore User
    479M    ./Commodore World
    1.3G    ./Compuserve Magazine
    18M    ./Compute Gazette/text
    4.6G    ./Compute Gazette
    513M    ./Compute's Amiga Resource
    52M    ./Computer Age
    1.8G    ./Computer Gamer
    12G    ./Computer Gaming World
    515M    ./Computer Language
    34M    ./Computer Monthly
    1.0G    ./Computer Notes
    1.2G    ./Computer Paper
    519M    ./Computer Play
    1.6G    ./Computer Shopper
    5.3G    ./Computers and Electronics
    1.6G    ./Computist
    19G    ./Creative Computing
    1.9M    ./Ctrl-ZINE
    27M    ./Custom PC/Books
    1.7G    ./Custom PC
    2.4G    ./Dragon Magazine Archive
    920K    ./Dragon Stop Press/text
    15M    ./Dragon Stop Press
    1.5G    ./Dragon User
    66M    ./Easy PC
    11G    ./Edge
    173M    ./Elbug
    517M    ./Enter
    4.3G    ./Family Computing
    9.5G    ./Galaxy Magazine
    919M    ./Games Computing
    1.1G    ./GamesMaster
    171M    ./H and E Computronics
    312M    ./HackSpace Magazine/HackSpace Books
    3.9G    ./HackSpace Magazine
    379M    ./Hardcore Apple Magazine
    152M    ./Home Computer Course
    955M    ./Home Computer Magazine
    1.3G    ./Home Computing Weekly
    44M    ./IEEE Annals of the History of Computing
    2.5G    ./Info
    834M    ./Infocom
    7.7G    ./Infoworld
    662M    ./Input
    6.8G    ./Interface Age
    19M    ./Interzone/Extras
    747M    ./Interzone
    108M    ./Just Magazines Computer Market
    239M    ./K-Power
    3.7G    ./Kay-Profiles
    11G    ./Kilobaud
    207M    ./Laserbug
    1.1G    ./Let's Compute
    22G    ./MacWorld
    965M    ./MagPi/MagPi Books
    4.8G    ./MagPi
    2.8G    ./Make
    228M    ./Manuals
    173M    ./Micro Adventurer
    158M    ./Micro Cornucopia
    2.0G    ./Micro Magazine
    3.1G    ./Micro User
    301M    ./Microtimes
    991M    ./Midnite Software Gazette
    263M    ./NODE
    84M    ./Neo Geo Artwork
    2.4G    ./New Computer Express
    150M    ./New Scientist
    822M    ./No Starch Press
    134M    ./O'Reilly
    625M    ./Official Dreamcast Mag
    27M    ./Old Computr
    1003M    ./Omni
    21M    ./PC Answers
    5.0G    ./PC Magazine
    2.6G    ./PC Review
    6.4G    ./PC Zone
    2.4G    ./People's Computer Company
    24G    ./Personal Computer World
    11G    ./Personal Computing Magazine
    69M    ./Phrack
    22M    ./Pixel Addict
    456M    ./Popular Computing Weekly
    3.3G    ./Popular Electronics
    7.8M    ./Practical Electronics
    188M    ./Professional Amiga
    247M    ./RHS The Garden
    4.6G    ./Run Magazine
    376M    ./ST Action
    1.4G    ./ST Amiga Format
    2.2G    ./Scientific American
    4.4G    ./Sinclair User
    5.6G    ./Softalk Apple
    633M    ./Softline
    2.8G    ./Softside
    1018M    ./Sync Magazine
    546M    ./TPUG Newsletter
    771M    ./TRS-80 Microcomputer News
    17G    ./Ted Nelson's Junk Mail
    73M    ./Texas Instruments Magazines
    22M    ./The Classic Adventurer
    712K    ./The One/text
    5.7G    ./The One
    88M    ./Tom Lehrer Music
    58M    ./Total Amiga
    1.3G    ./Total Magazine
    102M    ./Transactor Anthology
    1.6G    ./Transactor
    128M    ./Under Color Magazine
    1.5G    ./Unix Review
    6.6G    ./Whole Earth Catalog
    112M    ./Wireframe/Wireframe Books
    1.8G    ./Wireframe
    565M    ./Your 64
    10G    ./Your Commodore
    7.1G    ./Your Computer
    2.6G    ./Your Sinclair
    515M    ./Your Spectrum
    1.3G    ./ZX Computing Magazine
    3.3G    ./Zero
    906M    ./Zzap64/text
    3.6G    ./Zzap64
    114M    ./micromag
    8.4G    ./origin
    362M    ./Computer Shopper UK
    3.6G    ./Computer Shopper US
    444M    ./Cursor
    4.5G    ./Upper and Lower Case
    6.5G    ./Dr. Dobb's Journal
    33M    ./Between the Scanlines
    44M    ./Paged Out
    79M    ./Silicon Valley Engineer
    318M    ./Electronics In Action
    827M    ./PC Pro
    501G    .
    501G    total
    That doesn't include any of the work-in-progress stuff, nor my book collection...
     
    Byron C likes this.
  2. sandys

    sandys Multimodder

    Joined:
    26 Mar 2006
    Posts:
    5,199
    Likes Received:
    895
    Are you able to bring the GPU into service for some of this rather than more CPU?
     
  3. Byron C

    Byron C I was told there would be cheesecake…?

    Joined:
    12 Apr 2002
    Posts:
    11,027
    Likes Received:
    5,711
    Indeed it appears that your work is done :happy:

    [​IMG]
     
    wyx087 and Gareth Halfacree like this.
  4. Gareth Halfacree

    Gareth Halfacree WIIGII! Lover of bit-tech Administrator Super Moderator Moderator

    Joined:
    4 Dec 2007
    Posts:
    17,906
    Likes Received:
    7,872
    Sadly not: none of the tools I'm using support GPGPU offload.

    EDIT:
    Well, that's not strictly true: Tesseract OCR *used* to have OpenCL support, but it never really worked properly and nobody wanted to take it on so it got removed. ImageMagick *still* has OpenCL support, but it doesn't accelerate any of the operations I'm using (since dropping -contrast, anyway)... and even if it did, it explicitly doesn't support Nvidia cards.
     
    Last edited: 4 Apr 2025
  5. Gareth Halfacree

    Gareth Halfacree WIIGII! Lover of bit-tech Administrator Super Moderator Moderator

    Joined:
    4 Dec 2007
    Posts:
    17,906
    Likes Received:
    7,872
    I'll tell you what I hadn't tried, though: upgrading Tesseract. I'm on the version that's in the Ubuntu 20.04 (yes, yes, I should upgrade) repos, which is 4.1.1. Works fine, no problems, and it's the one where they added the machine-learning mode (LSTM), so a big jump over Tesseract OCR 3.

    But the current development version is Tesseract OCR 5. And there's a PPA. And it includes 20.04-compatible builds. And if it goes horribly, horribly wrong I can purge the PPA and go back to 4.1.1.

    Worth a shot!

    Snagged a random 100 pages from Issue 47 and processed them, as my sample set for benchmarking. Ran Tesseract 4 across 'em: 1m8s.

    Upgraded to Tesseract 5, ran it across the same files: 36.5s. Yeah, about twice as fast. That's... pretty impressive. Given that the OCR stage is the slowest part of the process (after how long Simple Scan takes to save the pages as PNGs), I'm happy with that.
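    For anyone wanting to put a number on "about twice as fast", here is a minimal sketch that turns the stopwatch figures quoted above into seconds and a ratio. The helper names `parse_duration` and `speedup` are made up for illustration and are not part of any tool used in this thread.

```python
import re


def parse_duration(text: str) -> float:
    """Convert a duration such as '1m8s', '36.5s' or '5m44s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(?:(\d+(?:\.\d+)?)s)?", text.strip())
    if not match or not any(match.groups()):
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes = int(match.group(1) or 0)
    seconds = float(match.group(2) or 0)
    return minutes * 60 + seconds


def speedup(before: str, after: str) -> float:
    """Return how many times faster 'after' is compared with 'before'."""
    return parse_duration(before) / parse_duration(after)


if __name__ == "__main__":
    # Figures from the 100-page sample run described in the post.
    ratio = speedup("1m8s", "36.5s")
    print(f"Tesseract 4 -> 5: {ratio:.2f}x faster")
```

    The same helper works for any of the other before/after timings in the thread, as long as they are written in the same `XmYs` style.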

    EDIT:
    Did a full-mag pass. From this:
    To 5m44s (only the PDF creation phase, the image processing step is unchanged). That's a big win.
     
    Last edited: 4 Apr 2025
    IanW, Byron C and wyx087 like this.
  6. Gareth Halfacree

    Gareth Halfacree WIIGII! Lover of bit-tech Administrator Super Moderator Moderator

    Joined:
    4 Dec 2007
    Posts:
    17,906
    Likes Received:
    7,872
    Had a dig this morning to see if there's any software option out there to speed up writing the PNGs from Simple Scan, which would shave another few minutes off the whole process. Turns out there are libpng alternatives that promise major speed gains (up to 12x, for one)... but they're not drop-in things, you have to build your software around them. There are also discussions about libpng itself switching to a new compression library (zlib-ng) which would deliver a 50-100% boost, but no concrete progress yet.
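    As a rough illustration of why PNG writing can be slow: PNG stores its pixel data as a DEFLATE stream, which libpng produces via zlib, and the compression level trades time against file size. The sketch below times Python's standard `zlib` module at a few levels on synthetic data. It is only an assumption-laden stand-in: it says nothing about which level Simple Scan actually uses, and `make_sample` is a made-up gradient, not a real scanned page.

```python
import time
import zlib
from typing import Tuple


def make_sample(rows: int = 512, width: int = 2048) -> bytes:
    """Build synthetic 8-bit 'image' rows (a shifting gradient), not a real scan."""
    return b"".join(bytes((x + y) % 256 for x in range(width)) for y in range(rows))


def time_deflate(data: bytes, level: int) -> Tuple[float, int]:
    """Compress data at the given zlib level; return (seconds taken, compressed size)."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return elapsed, len(compressed)


if __name__ == "__main__":
    sample = make_sample()
    for level in (1, 6, 9):
        elapsed, size = time_deflate(sample, level)
        print(f"level {level}: {elapsed * 1000:.1f} ms, {size} bytes")
```

    Real scans compress very differently from a clean gradient, so treat the output as a demonstration of the trade-off rather than a prediction.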

    So, that means it's Throw Money At The Problem Time again! Picked up (hopefully) a 5900X on the Marketplace, which should slot into my existing motherboard (again, hopefully: AMD's site says you need a 500-series, but Asus says my B450 has supported it since the last BIOS update) and run with my existing Wraith Thingumybobby cooler (again again, hopefully: both the 5900X and my 2700X are 105W TDP, so it should work just as well). 12C24T instead of 8C16T, so an instant 50% boost to the number of workers I can run - and a 4.8GHz boost clock, up from 4.3GHz (though all-core will be lower - my 2700X sits at 4GHz all-core with the Wraith set to quiet mode), plus all the benefits of Zen 3 over Zen+.

    It'll be interesting to do a few benchmarks!
     
    Byron C likes this.
  7. Gareth Halfacree

    Gareth Halfacree WIIGII! Lover of bit-tech Administrator Super Moderator Moderator

    Joined:
    4 Dec 2007
    Posts:
    17,906
    Likes Received:
    7,872
    New CPU get, and installed along with the new cooler!

    Was crashing on me a little bit, but seems happier now I've reset the BIOS to defaults. DOCP profile now runs the RAM at 3,600MHz, which is nice, and the cooler remains inaudible over the rest of the fans even running all the cores at full load - 4.4GHz (dipping to 4.3GHz, up from the 4GHz the 2700X could do), maximum temperature of 72°C (Tctl, the highest of all the sensor readings) and the fans are at about 40%? Happy with that.

    As for performance? Ran a quick ImageMagick test to see: 4m11s, down from seven minutes. Happy with that!
     
  8. yuusou

    yuusou Multimodder

    Joined:
    5 Nov 2006
    Posts:
    3,153
    Likes Received:
    1,225
    Do take some time to look at Curve Optimizer in the BIOS. It'll be under PBO, which will probably be under overclocking.
    By using a negative curve, you're essentially telling the system it can use less voltage for a certain frequency.
    This means it may even sustain the 4.4GHz at a cooler temperature.
    Typically most Ryzen CPUs should be able to do -20 (-0.1V), some should do -25 (-0.125V), a golden sample can do -30 (-0.15V).
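    The figures above work out to 5mV per Curve Optimizer count. A minimal sketch of that arithmetic follows; the 5mV step is an assumption inferred purely from the numbers in this post (the real per-count step can differ by chip and firmware), and `curve_offset_mv` is a made-up helper name.

```python
# Assumption: 5 mV per count, inferred from "-20 (-0.1V)" in the post above.
MV_PER_COUNT = 5


def curve_offset_mv(counts: int, mv_per_count: int = MV_PER_COUNT) -> int:
    """Approximate voltage offset in millivolts for a Curve Optimizer count."""
    return counts * mv_per_count


if __name__ == "__main__":
    for counts in (-20, -25, -30):
        print(f"{counts:+d} counts is roughly {curve_offset_mv(counts):+d} mV")
```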
     
  9. Gareth Halfacree

    Gareth Halfacree WIIGII! Lover of bit-tech Administrator Super Moderator Moderator

    Joined:
    4 Dec 2007
    Posts:
    17,906
    Likes Received:
    7,872
    I've got PBO ("auto" at the moment) but no curve optimiser. My motherboard may be too old and/or cheap for that. (Asus Prime B450M-K... Or K-M, I forget.)
     
  10. yuusou

    yuusou Multimodder

    Joined:
    5 Nov 2006
    Posts:
    3,153
    Likes Received:
    1,225
    It'll be under "Advanced" instead of "Auto".
     
    Gareth Halfacree likes this.
  11. Gareth Halfacree

    Gareth Halfacree WIIGII! Lover of bit-tech Administrator Super Moderator Moderator

    Joined:
    4 Dec 2007
    Posts:
    17,906
    Likes Received:
    7,872
    Well, I dunno what parts of the CPU Tesseract exercises... but it does not like sharing those resources. I'm doing a full-mag run as a test, and I was seeing the same thing the whole way through - 4.4-4.3GHz boost, 70°C temperature or so - until Tesseract got involved. Then the temperature was lower, but the boost clock didn't go above 4.1GHz... and even though the temperature never went above 68°C the CPU had some kind of panic and dropped to 500MHz for a bit(!). Happened four or five times in the run, only for a few seconds but... yeah.

    May need to experiment, see if fewer Tesseract workers would be faster than a full 24.

    EDIT:
    Oh! The numbers: 4m12s for the PNG processing, 3m43s to generate the PDF - down from 5m44s on the old CPU. I can now process the PNGs and generate the PDF in less time than processing the PNGs alone, which is nice.

    EDIT EDIT:
    Yeah, I dunno if this is a Ryzen 5000-series thing or a Tesseract OCR 5 thing, but it does not like running on virtual cores. If I limit Tesseract to 12 workers - and only Tesseract, leaving the other stages at 24 - the CPU remains at 4.4-4.3GHz, no weird dips, and the job finishes in 3m41s - two seconds less than with 24 workers. Go figure!

    EDIT EDIT EDIT:
    12 workers, but each worker is allowed two threads: full boost clock, no weird dips, total time drops to 4m1s. Guess that's the fix!

    Wait, no, I'm focusing on the seconds and ignoring the minutes: it's slower than 12 single-threaded workers. Right, fine, 12 single-threaded workers, then!
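    A minimal sketch of the "different worker count per stage" idea from the edits above, for anyone wanting to try the same experiment. This is not the actual script used in this worklog, which isn't shown; it assumes the pages are PNGs in a hypothetical `pages/` directory, drives the standard `tesseract <image> <output base> pdf` command line, and caps each Tesseract process's OpenMP threads with the `OMP_THREAD_LIMIT` environment variable. `run_stage`, `tesseract_command` and `ocr_page` are made-up names.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def run_stage(items: Iterable[T], work: Callable[[T], R], workers: int) -> List[R]:
    """Run one pipeline stage over all items, with its own worker count."""
    # Threads are enough here: each worker just waits on an external process.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(work, items))


def tesseract_command(png: Path) -> List[str]:
    """Build the command that OCRs one page into '<same name>.pdf'."""
    return ["tesseract", str(png), str(png.with_suffix("")), "pdf"]


def ocr_page(png: Path, threads_per_worker: int = 1) -> Path:
    """OCR a single page, limiting the OpenMP threads Tesseract may use."""
    env = dict(os.environ, OMP_THREAD_LIMIT=str(threads_per_worker))
    subprocess.run(tesseract_command(png), check=True, env=env)
    return png.with_suffix(".pdf")


if __name__ == "__main__":
    pages = sorted(Path("pages").glob("*.png"))  # hypothetical input directory
    physical_cores = 12  # 5900X: 12 cores, 24 threads
    pdfs = run_stage(pages, ocr_page, workers=physical_cores)
    print(f"OCRed {len(pdfs)} pages")
```

    Swapping `workers=physical_cores` for 24, or passing a different `threads_per_worker`, gives the other combinations tried above. Which one wins will depend on the machine, as the next post shows.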
     
    Last edited: 9 Apr 2025
  12. Gareth Halfacree

    Gareth Halfacree WIIGII! Lover of bit-tech Administrator Super Moderator Moderator

    Joined:
    4 Dec 2007
    Posts:
    17,906
    Likes Received:
    7,872
    Okay, so, it's not Tesseract doing something weird - or it is, but it's 'cos Tesseract is a hungry hungry hippo of a workload. Turns out that my CPU is cool as you like, but the VRMs on my bargain-basement board? The ones with no heatsinks? The ones that were previously under the downward-pointing fan of the Wraithy Thingy cooler, and now have no direct airflow? Yeah, they're the problem.

    Turned up the front intake fans - they're on a manual rheostat that came with the case - and suddenly I can run 24 Tesseract workers without seeing panic-drops to 547MHz. It's not hitting the full 4.4GHz all-core boost - it does for the other stages, but for Tesseract it's stuck at 4.1GHz - but there's a real performance gain: 12 workers takes 3m43s, 24 takes 3m8s.

    If I can get the VRMs even cooler, maybe the CPU will boost higher and I can get it sub-three-minutes...
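    For spotting those panic-drops without staring at a monitor, here is a minimal sketch that reads the per-core `cpu MHz` lines from `/proc/cpuinfo` on x86 Linux and flags any core below a floor. The 1,000MHz floor is an arbitrary assumption, and the check only means anything while every core is under full load, since idle cores clock down on their own. `parse_core_mhz` and `throttled_cores` are made-up names.

```python
import re
from pathlib import Path
from typing import List


def parse_core_mhz(cpuinfo_text: str) -> List[float]:
    """Extract the per-core 'cpu MHz' readings from /proc/cpuinfo text."""
    pattern = r"^cpu MHz\s*:\s*([\d.]+)"
    return [float(value) for value in re.findall(pattern, cpuinfo_text, flags=re.MULTILINE)]


def throttled_cores(mhz: List[float], floor_mhz: float = 1000.0) -> List[int]:
    """Return the indexes of cores currently running below the floor."""
    return [index for index, freq in enumerate(mhz) if freq < floor_mhz]


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        readings = parse_core_mhz(cpuinfo.read_text())
        slow = throttled_cores(readings)
        if readings:
            print(f"min {min(readings):.0f} MHz, max {max(readings):.0f} MHz")
        print(f"cores below floor: {slow}")
    else:
        print("/proc/cpuinfo not found; this sketch assumes Linux")
```

    Running it in a loop during the OCR stage, before and after changing the airflow, would show whether the dips have actually gone away.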
     
  13. sandys

    sandys Multimodder

    Joined:
    26 Mar 2006
    Posts:
    5,199
    Likes Received:
    895
    Great that cooling is doing it for you; it would be a bit disappointing to have dropped a chip in and not be able to use it. Winding back of clocks at full load is normal: you'll be hitting one of the limits (Temp, PPT, TDP, EDC, etc.). Ryzen Master gives you a nice GUI to watch this stuff and open some limits if you have scope. I'm sure there will be a similar Linux utility out there that can point you in the right direction, though given the board situation maybe it's best left alone.

    Or you could push your luck, cool the VRMs and try a fixed OC of 4.4-4.6 and see if gains are worth it at the sacrifice of low load boost.

    Example guide.

    Overclock The AMD Ryzen 9 5900X to ALL CORE 4.6GHz with the MEG B550 UNIFY-X

    I wouldn't be able to help myself; I'd be trying all the things: lowering latency, upping RAM speed, etc. I even did this with my APUs on a similarly poor B450 board. I have an OC problem :D
     
    Last edited: 10 Apr 2025
    Gareth Halfacree likes this.
  14. yuusou

    yuusou Multimodder

    Joined:
    5 Nov 2006
    Posts:
    3,153
    Likes Received:
    1,225
    Could always bodge on a custom heatsink held in place with some sticky thermal pads.
     
  15. Gareth Halfacree

    Gareth Halfacree WIIGII! Lover of bit-tech Administrator Super Moderator Moderator

    Joined:
    4 Dec 2007
    Posts:
    17,906
    Likes Received:
    7,872
