if you could standardise a file format for a specific task what would you pick and why

jackpot@lemmy.ml · edit-2 1 year ago

palordrolap@kbin.social · 1 year ago

Just going to leave this xkcd comic here.

Yes, you already know what it is.

taladar@sh.itjust.works · 1 year ago

One could say it is the standard comic for these kinds of discussions.

DigitalJacobin@lemmy.ml · edit-2 1 year ago

This is the kind of thing i think about all the time so i have a few.

Archive files: .tar.zst
- Produces better compression ratios than the DEFLATE compression algorithm (used by .zip and gzip/.gz) and does so faster.
- By separating the jobs of archiving (.tar), compressing (.zst), and (if you so choose) encrypting (.gpg), .tar.zst follows the Unix philosophy of “Make each program do one thing well.”
- .tar.xz is also very good and seems more popular (probably since it was released 6 years earlier in 2009), but, when tuned to its maximum compression level, .tar.zst can achieve a compression ratio pretty close to LZMA (used by .tar.xz and .7z) and do it faster^[1].
  
  zstd and xz trade blows in their compression ratio. Recompressing all packages to zstd with our options yields a total ~0.8% increase in package size on all of our packages combined, but the decompression time for all packages saw a ~1300% speedup.
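
A rough sketch of the "archive first, compress separately" idea from the bullets above, in Python with only the standard library. The file names and contents are made up, and zstd is left out only because older Python standard libraries don't ship it; DEFLATE (zlib) and LZMA (xz) stand in for the compression step.

```python
import io
import lzma
import tarfile
import zlib


def make_tar(files: dict[str, bytes]) -> bytes:
    """Archive step only: bundle files into an uncompressed tar stream."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_tar(raw: bytes) -> dict[str, bytes]:
    """Inverse of make_tar: unpack an uncompressed tar stream."""
    out = {}
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r:") as tar:
        for member in tar:
            if member.isfile():
                out[member.name] = tar.extractfile(member).read()
    return out


# Hypothetical sample payload, repetitive on purpose so it compresses well.
FILES = {
    "notes.txt": b"standardise all the formats\n" * 200,
    "data.csv": b"id,value\n" + b"1,42\n" * 500,
}

archive = make_tar(FILES)

# Compression step: any byte-stream compressor can be swapped in here.
as_deflate = zlib.compress(archive, 9)      # DEFLATE, as used by gzip and .zip
as_lzma = lzma.compress(archive, preset=9)  # LZMA2 in an .xz container

# Undo the two layers in reverse order: decompress, then unarchive.
restored = read_tar(lzma.decompress(as_lzma))

if __name__ == "__main__":
    print("tar:", len(archive), "deflate:", len(as_deflate), "xz:", len(as_lzma))
```

The tar layer never needs to know which compressor wraps it, which is the whole point of the separation; the sizes printed depend entirely on the sample data, so they say nothing about real-world ratios.
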
Image files: JPEG XL/.jxl
- “Why JPEG XL”
- Free and open format.
- Can handle lossy images, lossless images, images with transparency, images with layers, and animated images, giving it the potential of being a universal image format.
- Much better quality and compression efficiency than current lossy and lossless image formats (.jpeg, .png, .gif).
- Produces much smaller files for lossless images than AVIF^[2]
- Supports much larger resolutions than AVIF’s 9-megapixel limit (important for lossless images).
- Supports up to 24-bit color depth, much more than AVIF’s 12-bit color depth limit (which, to be fair, is probably good enough).
Videos (Codec): AV1
- Free and open format.
- Much more efficient than H.264 (the codec typically found in .mp4 files, which x264 encodes) and VP9^[3].
Documents: OpenDocument / ODF / .odt
- @raubarno@lemmy.ml says it best here. .odt is simply a better standard than .docx.
  it’s already a NATO standard for documents.
  
  Because the Microsoft Word ones (.doc, .docx) are unusable outside the Microsoft Office ecosystem. I feel outraged every time I need to edit a .docx file because it breaks the layout easily. And some older .doc files cannot even work with Microsoft Word.

lloram239@feddit.de · edit-2 1 year ago

.tar is pretty bad as it lacks an index, making it impossible to quickly seek around in the file. The compression on top adds another layer of complication. It might still work great as a tape archiver, but for sending files around the Internet it is quite horrible. It’s really just getting dragged around for cargo cult reasons, not because it’s good at the job it is doing.
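
To illustrate the index point, here is a small Python sketch (standard library only; the member names and sizes are invented for the demo). A zip reader can jump to one member via the central directory stored at the end of the file, while a tar reader has to walk the member headers in order until it finds the right name.

```python
import io
import tarfile
import zipfile

# Hypothetical members; names and sizes are made up for the demo.
MEMBERS = {f"file{i}.txt": bytes([65 + i]) * 1000 for i in range(5)}


def build_zip() -> bytes:
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in MEMBERS.items():
            zf.writestr(name, data)
    return buf.getvalue()


def build_tar() -> bytes:
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in MEMBERS.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def zip_read_member(raw: bytes, wanted: str) -> bytes:
    # ZipFile loads the central directory from the end of the file,
    # then seeks straight to the requested member.
    with zipfile.ZipFile(io.BytesIO(raw)) as zf:
        return zf.read(wanted)


def tar_headers_before(raw: bytes, wanted: str) -> int:
    # tar has no index: iterate over member headers in order until the
    # name matches, and report how many members had to be passed first.
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r:") as tar:
        for position, member in enumerate(tar):
            if member.name == wanted:
                return position
    raise KeyError(wanted)
```

On an uncompressed, seekable tar the reader can at least skip over member data between headers. Once the whole stream is wrapped in a compressor, as with .tar.gz or .tar.zst, everything before the wanted member generally has to be decompressed first.
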

In general I find the archive situation a little annoying, as archives are largely completely unnecessary, that’s what we have directories for. But directories don’t exist as far as HTML is concerned and only single files can be downloaded easily. So everything has to get packed and unpacked again, for absolutely no reason. It’s a job computers should handle transparently in the background, not an explicit user action.

Many file managers try to add support for .zip and allow you to go into them like it is a folder, but that abstraction is always quite leaky and never as smooth as it should be.

jackpot@lemmy.ml · 1 year ago

is av1 lossy

DigitalJacobin@lemmy.ml · 1 year ago

AV1 can do lossy video as well as lossless video.

𝕽𝖚𝖆𝖎𝖉𝖍𝖗𝖎𝖌𝖍@midwest.social · 1 year ago

Resume information. There have been several attempts, but none have become an accepted standard.

When I was a consultant, this was the one standard I longed for the most. A data file where I could put all of my information, and then filter and format it for each application. But ultimately, I wanted to be able to submit the information in a standardised format - without having to re-enter it endlessly into crappy web forms.

I think things have gotten better today, but at the cost of a reliance on a monopoly (LinkedIn). And I’m not still in that sort of job market. But I think that desire was so strong it’ll last me until I’m in my grave.

Elise@beehaw.org · edit-2 1 year ago

I wish there was a more standardized open format for documents. And more people and software should use markdown/.md because you just don’t need anything fancier for most types of documents.

raubarno@lemmy.ml · 1 year ago

Open Document Standard (.odt) for all documents. In all public institutions (it’s already a NATO standard for documents).

Because the Microsoft Word ones (.doc, .docx) are unusable outside the Microsoft Office ecosystem. I feel outraged every time I need to edit a .docx file because it breaks the layout easily. And some older .doc files cannot even work with Microsoft Word.

Actually, IMHO, there should be some better alternative to .odt as well. Something more declarative/scripted in style, like LaTeX, but still WYSIWYG. LaTeX (and XeTeX, for my use cases) is too messy for me to work with, especially when a package is Byzantine. And it can be non-reproducible if I share/reuse the same document somewhere else.

Something has to be made with document files.

monobot@lemmy.ml · 1 year ago

It is unbelievable we do not have a standard document format.

DigitalJacobin@lemmy.ml · 1 year ago

What’s messed up is that, technically, we do. Originally, OpenDocument was the ISO standard document format. But then, baffling everyone, Microsoft got the ISO to also have .docx as an ISO standard. So now we have 2 competing document standards, the second of which is simply worse.

Diana@discuss.tchncs.de · 2 months ago

MKV. It supports high-quality video and audio codecs, allowing for lossless compression and high-definition content. MKV also supports chapter and menu functionality, making it suitable for ripping and storing DVDs and Blu-ray discs.

Björn Tantau@swg-empire.de · edit-2 1 year ago

zip or 7z for compressed archives. I hate that for some reason rar has become the de facto standard for piracy. It’s just so bad.

The other day I saw a tar.gz containing a multipart-rar which contained an iso which contained a compressed bin file with an exe to decompress it. Soooo unnecessary.

Edit: And the decompressed game of course has all of its compressed assets in renamed zip files.

KSP Atlas@sopuli.xyz · 1 year ago

.tar.xz masterrace

d_k_bo@feddit.de · 3 months ago

This comment didn’t age well.

AlexWIWA@lemmy.ml · 1 year ago

Markdown for all rich text that doesn’t need super fancy shit like latex

lloram239@feddit.de · edit-2 1 year ago

I’d set up a working group to invent something new. Many of our current formats are stuck in the past, e.g. PDF or ODF are still emulating paper, even though everybody keeps reading them on a screen. What I want to see is a standard document format that is built for the modern day Internet, with editing and publishing in mind. HTML ain’t it, as that can’t handle editing well or long form documents, EPUB isn’t supported by browsers, Markdown lacks a lot of features, etc. And then you have things like Google Docs, which are Internet aware, editable, shareable, but also completely proprietary and lock you into the Google ecosystem.

mexicancartel@lemmy.dbzer0.com · 1 year ago

Epub isn’t supported by browsers

So you want EPUB support in browser and you have the ultimate document file format?

drwankingstein@lemmy.dbzer0.com · 1 year ago

matroska for media, we already have MKA for audio and MKV for video. An image container would be good too.

mp4 is more prone to data loss and slower to parse, while also being less flexible. Despite this, it seems to be a sort of pseudo standard.

(MP4, M4A, HEIF formats like heic, avif)
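
For anyone wondering how these two families are told apart on disk, a minimal Python sketch follows. The function name is made up; the byte patterns are the usual ones (an EBML header ID for Matroska/WebM, and an `ftyp` box with a four-character major brand for the MP4 family). It only looks at the first bytes and says nothing about which codecs are inside.

```python
EBML_MAGIC = bytes.fromhex("1A45DFA3")  # EBML header ID that opens Matroska/WebM files


def sniff_container(head: bytes) -> str:
    """Guess the container family from the first bytes of a file."""
    if head[:4] == EBML_MAGIC:
        return "matroska/webm (EBML)"
    if len(head) >= 12 and head[4:8] == b"ftyp":
        # ISO base media layout: 4-byte box size, b"ftyp", then a 4-byte major brand
        # (for example "isom", "M4A ", "heic" or "avif").
        brand = head[8:12].decode("ascii", errors="replace")
        return f"iso-bmff (brand {brand!r})"
    return "unknown"
```

Usage would be something like `sniff_container(open(path, "rb").read(12))`, where `path` is whatever file you want to check.
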

sunbeam60@lemmy.one · 1 year ago

SQLite for all “I’m going to write my own binary format because I is haxor” jobs.

There are some specific cases where SQLite isn’t appropriate (streaming). But broadly it fits in 99% of cases.
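
A minimal sketch of what "SQLite as the file format" can look like, using Python's built-in `sqlite3`. The schema, the `.mydoc` extension and the function names are all invented for illustration; a real application would design its own tables.

```python
import os
import sqlite3
import tempfile


def save_document(path: str, title: str, paragraphs: list[str]) -> None:
    """Write a hypothetical 'document' into a single SQLite file."""
    con = sqlite3.connect(path)
    try:
        with con:  # commits on success, rolls back if an exception is raised
            con.execute("CREATE TABLE IF NOT EXISTS meta (key TEXT PRIMARY KEY, value TEXT)")
            con.execute("CREATE TABLE IF NOT EXISTS paragraph (pos INTEGER PRIMARY KEY, body TEXT)")
            con.execute("DELETE FROM paragraph")
            con.execute("INSERT OR REPLACE INTO meta VALUES ('title', ?)", (title,))
            con.executemany(
                "INSERT INTO paragraph (pos, body) VALUES (?, ?)",
                list(enumerate(paragraphs)),
            )
    finally:
        con.close()


def load_document(path: str) -> tuple[str, list[str]]:
    """Read the title and the ordered paragraphs back out of the file."""
    con = sqlite3.connect(path)
    try:
        (title,) = con.execute("SELECT value FROM meta WHERE key = 'title'").fetchone()
        rows = con.execute("SELECT body FROM paragraph ORDER BY pos").fetchall()
        return title, [body for (body,) in rows]
    finally:
        con.close()


# Round trip through a throwaway file to show the format in use.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "example.mydoc")
    save_document(demo_path, "Formats", ["first paragraph", "second paragraph"])
    loaded = load_document(demo_path)
```

The appeal is that parsing, partial reads and crash-safe writes come from the library instead of hand-rolled binary code, and the file can be inspected with any SQLite tool.
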

barrett9h@lemmy.one · 1 year ago

192 kHz for music.

The CD was the worst thing to happen in the history of audio. 44 (or 48) kHz is awful, and it is still prevalent. It would be better to wait a few more years and have better quality.

Supermariofan67@programming.dev · 1 year ago

Why? What reason could there possibly be to store frequencies as high as 96 kHz? The limit of human hearing is 20 kHz, hence why 44.1 and 48 kHz sample rates are used
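
To put numbers on that, a small Python sketch (the function names are mine, not from any audio library). The highest representable frequency is half the sample rate, and a tone above that limit folds back to a lower frequency if it isn't filtered out before sampling.

```python
def nyquist_limit(sample_rate_hz: float) -> float:
    """Highest frequency a given sample rate can represent (half the rate)."""
    return sample_rate_hz / 2


def aliased_frequency(tone_hz: float, sample_rate_hz: float) -> float:
    """Frequency a pure tone shows up at after sampling with no low-pass filter."""
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


if __name__ == "__main__":
    for rate in (44_100, 48_000, 96_000, 192_000):
        print(f"{rate} Hz sampling -> content up to {nyquist_limit(rate)} Hz")
    # A 30 kHz tone sampled at 48 kHz lands on 18 kHz.
    print(aliased_frequency(30_000, 48_000))
```

So 44.1 kHz already covers up to 22.05 kHz, and a 192 kHz file is the one that can hold content up to 96 kHz.
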

bellsDoSing@lemm.ee · 1 year ago

On top of that, 20 kHz is quite the theoretical upper limit.

Most people, be it due to aging (affects all of us) or due to behaviour (some way more than others), can’t hear that far up anyway. Most people would be surprised how high up even e.g. 17 kHz is. Sounds a lot closer to very high pitched “hissing” or “shimmer”, not something that’s considered “tonal”.

So yeah, saying “oh no, let me have my precious 30 kHz” really is questionable.

At least when it comes to listening to finished music files. The validity of higher sampling frequencies during various stages in the audio production process is a different, way less questionable topic.

christophski@feddit.uk · 1 year ago

That is not what 96khz means. It doesn’t just mean it can store frequencies up to that frequency, it means that there are 96,000 samples every second, so you capture more detail in the waveform.

Having said that I’ll give anyone £1m if they can tell the difference between 48khz and 96khz. 96khz and 192khz should absolutely be used for capture but are absolutely not needed for playback.

IsoKiero@sopuli.xyz · 1 year ago

I don’t know what to pick, but something other than PDF for the task of transferring documents between multiple systems. And yes, I know, PDF has its strengths and there’s a reason why it’s so widely used, but it doesn’t mean I have to like it.

Additionally, all proprietary formats, especially ones that have gained enough users that they’re treated like a standard or requirement if you want to work with X.

StarkillerX42@lemmy.ml · 1 year ago

I would be fine with PDFs exactly the same except Adobe doesn’t exist and neither does Acrobat.

Supermariofan67@programming.dev · 1 year ago

Ogg Opus for all lossy audio compression (mp3 needs to die)

7z or tar.zst for general purpose compression (zip and rar need to die)

Aatube@kbin.social · 1 year ago

What’s wrong with mp3

Knusper@feddit.de · 1 year ago

Big file size for rather bad audio quality.

MonkderZweite@feddit.ch · 1 year ago

How are you going to recreate the MP3 audio artifacts that give a lot of music its originality, when encoding to OPUS?

Oh, a gramophone user.

Joke aside, i find ogg Opus often sounds better than the original. Probably something to do with its psychoacoustic optimizations.

neo (he/him)@lemmy.comfysnug.space · 1 year ago

.opus for lossy music, .flac for lossless music, .png for image files, .mkv for video

mexicancartel@lemmy.dbzer0.com · 1 year ago

jxl for images, vp9 for video, ogg vorbis for lossy audio and flac for lossless

ProgrammingSocks@pawb.social · 1 year ago

Opus is the successor to Vorbis. It’s superior in terms of quality to bitrate for all bitrates, and it’s made by the same organization.

nik0@lemm.ee · 1 year ago

whats the difference between opus and 320 mp3?

Spore@lemmy.ml · edit-2 1 year ago

The point is 140kbps opus is almost identical to 320kbps mp3 to the human ear, so it saves over 50% size for the same quality. Also it’s a royalty free format.
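
The arithmetic behind the "over 50%" figure, as a quick Python sketch. The function names and the four-minute track length are just assumptions for the example, and container overhead is ignored.

```python
def file_size_mb(bitrate_kbps: float, duration_s: float) -> float:
    """Approximate audio payload size in megabytes (10^6 bytes)."""
    return bitrate_kbps * 1000 * duration_s / 8 / 1_000_000


def saving_percent(old_kbps: float, new_kbps: float) -> float:
    """Relative size reduction when moving from old_kbps to new_kbps."""
    return (1 - new_kbps / old_kbps) * 100


if __name__ == "__main__":
    track_seconds = 240  # hypothetical four-minute song
    print(file_size_mb(320, track_seconds))  # 320 kbps mp3
    print(file_size_mb(140, track_seconds))  # 140 kbps opus
    print(saving_percent(320, 140))
```

For that four-minute track it works out to roughly 9.6 MB versus 4.2 MB, a saving of about 56%.
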

neo (he/him)@lemmy.comfysnug.space · 1 year ago

opus is higher quality at a much lower bitrate, meaning you can definitely store more songs in the opus format than in 320 mp3. opus can be constant bit rate or variable bit rate, whichever you prefer at encode time

Kait Richardshroom 🍄🏳️‍⚧️@calckey.world · 1 year ago

@neo@lemmy.comfysnug.space I thought opus was voice optimized?

neo (he/him)@lemmy.comfysnug.space · 11 months ago

It is, but it’s not the only thing opus is optimized for

BehindTheBarrier@programming.dev · 1 year ago

All of them are OK, except mkv is less a file type and more a container. What should be specified is the codec for video, which for most things I’d say AV1, though it might not be the most suitable for high res movies. Throw in opus for the audio track, and you can use mkv, but might as well use webm anyways since it’s more clear what’s behind it (though it can still be other things).

I’d also add that jxl should be the standard for lossy images. Better than jpg. And you want something other than png for massive images because that quickly gets costly in terms of size due to png being lossless.

SnipingNinja@slrpnk.net · 1 year ago

Unpopular opinion but webp isn’t bad it just needs wider support, but maybe I’m unaware of its actual shortcomings in which case please educate.

Also I wonder if it’s possible to have a single image format for all those uses but also RAW?

BehindTheBarrier@programming.dev · edit-2 1 year ago

Here’s a little article which highlights jxl well. https://chipsandcheese.com/2021/02/28/modern-data-compression-in-2021-part-2-the-battle-to-dethrone-jpeg-with-jpeg-xl-avif-and-webp/

I do not think it’s mentioned there, but I think webp and also its indirect successor avif, afaik, both lack progressive loading, which is not optimal for website loading. They have incremental loading, which I think is akin to the old dial-up days of loading top to bottom, row by row. They claim progressive decoding is costly on memory and CPU, but progressive gives the best user experience imo.

Lastly, a fringe issue: re-encoding multiple times. It’s the good old reason why jpgs turn into trash over time, because people re-encode instead of saving the original images, or because sites re-encode when uploading. Jxl wins here. It is also very easy to see why jpg turns into what it does rather quickly.

https://www.reddit.com/r/AV1/comments/ju18pz/generation_loss_comparing_jpeg_webp_jxl_and_avif/