Commons:Bots/Requests/ImageConverterBot

Operator: DaxServer (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought: Convert TIFF files to JPEG files and link both, as requested at Commons:Bots/Work requests § Convert Category:Photographs by Carol M. Highsmith to JPEG. The TIFF files in Category:Photographs by Carol M. Highsmith are loaded [recursively] by the bot and converted to JPEG using Wand, a Python binding for ImageMagick. The Exif metadata is copied over using PyExifTool, a Python binding for ExifTool by Phil Harvey. The metadata groups being copied over, as far as I have discovered so far, are: Author, Camera, Composite, ExifIFD, GPS, ICC_Profile, IFD0, IPTC, Location and XMP-crs. The entire metadata can be copied indiscriminately if that is preferred rather than a selection. The new JPEG file page will have the same wikitext as the TIFF file page, with the addition of an {{{other_versions}}} gallery but with the removal of categories such as Uploaded by xyz user, as these will be retained on the TIFF file page. The TIFF file page is edited to add a link to the JPEG in the gallery, and all the categories are removed, with the addition of Category:LC TIF images with categorized JPGs. If a duplicate is found by checksum, the page is skipped and marked for manual verification and linking via the gallery. The OpenCV strategy described at User:Fæ/LOC#Housekeeping is rather out of my reach. The bot is being written using Pywikibot and is intended to run on Toolforge.
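
A minimal sketch of the conversion steps described above, for illustration only and not the bot's actual code. The function names, the JPEG quality of 95, the `-overwrite_original` flag and the use of SHA-1 as the checksum are all assumptions; the request does not specify any of them.

```python
import hashlib

# Metadata groups listed in the request; each is passed to ExifTool as -GROUP:all.
METADATA_GROUPS = [
    "Author", "Camera", "Composite", "ExifIFD", "GPS",
    "ICC_Profile", "IFD0", "IPTC", "Location", "XMP-crs",
]


def convert_tiff_to_jpeg(src: str, dst: str, quality: int = 95) -> None:
    """Convert one TIFF to JPEG with Wand (quality value is an assumption)."""
    # Imported lazily so the helpers below work without Wand/ImageMagick.
    from wand.image import Image

    with Image(filename=src) as img:
        img.format = "jpeg"
        img.compression_quality = quality
        img.save(filename=dst)


def exiftool_copy_args(src: str, dst: str, groups=METADATA_GROUPS) -> list[str]:
    """Build ExifTool arguments that copy the selected groups from src into dst."""
    args = ["-TagsFromFile", src]
    args += [f"-{group}:all" for group in groups]
    # Assumption: rewrite dst in place instead of keeping a *_original backup.
    args += ["-overwrite_original", dst]
    return args


def sha1_hex(data: bytes) -> str:
    """Hex SHA-1 digest of the file bytes (assumed checksum type)."""
    return hashlib.sha1(data).hexdigest()


def is_duplicate(data: bytes, known_sha1: set[str]) -> bool:
    """True if a file with the same checksum is already known."""
    return sha1_hex(data) in known_sha1
```

The argument list from `exiftool_copy_args` could be handed to the `exiftool` command line or, presumably, to PyExifTool's `execute` method. In a real run, the set of known checksums would likely come from a lookup on Commons itself, for example the MediaWiki API's `allimages` list with its `aisha1` parameter, rather than from a local set.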

Automatic or manually assisted: Automatic

Edit type (e.g. Continuous, daily, one time run): Continuous

Maximum edit rate (e.g. edits per minute): 5

Bot flag requested: (Y/N): Y

Programming language(s): Python (Pywikibot)

-- DaxServer (talk) 15:07, 1 July 2024 (UTC)

Discussion
I'm not able to understand the issue we are trying to solve. All previews of these gigantic TIFFs load just fine for me (in under 2 seconds). I do not experience much difference compared to JPEGs. --Schlurcher (talk) 14:18, 2 July 2024 (UTC)
  On hold for the discussion linked -- DaxServer (talk) 08:58, 4 July 2024 (UTC)
TIF is an archival format which is simply not suitable for web use. For example, TIF file previews look much worse than JPG when used in Wikipedia articles; being "lossless" doesn't improve the quality actually displayed, it makes it worse. "Freely usable media" also means not needing a very fast internet connection or special programs to edit the files. Another random example: free email clients allow only very limited attachment sizes (Gmail 25 MB, for example), and sending one document which includes a 100 MB TIF image would not be possible for the average person, who has no clue about file formats. TheImaCow (talk) 11:58, 24 July 2024 (UTC)
The argument appears to be in line with Commons:File types. Given that, should we convert all TIFF files, replace their usage, and delete the TIFFs? Krd 07:34, 27 July 2024 (UTC)
I would generally support that, but I am sure that this would need wider consensus (as it would also affect these 200k images). TheImaCow (talk) 21:13, 28 July 2024 (UTC)
I'd appreciate it if anybody could start such a discussion at a suitable venue. Krd 05:19, 3 August 2024 (UTC)