Comment by 🎸 bluesman
I worked out something that covers the scenarios I've encountered. Streaming directly off the wire into the renderer has made things more difficult than they otherwise would have been. Thanks for the advice.
2025-03-24 · 1 year ago
4 Later Comments
clseibold · 2025-03-24 at 05:56:
@bluesman Oh, for being able to render from the stream, the mimetype detector I use in my golang client will only read a certain number of bytes at the beginning of the file, so I just buffer that for the mimetype detector, and then basically place it back into a multireader for the parser/renderer.
So, I read the number of bytes required for the mimetype detector from the connection into a buffer, do the mimetype detection, and then place the buffer along with the connection into a multireader.
The multireader is used by the parser and renderer, and what happens is it streams from the buffer used for the mimetype detector first until it reaches the end, and then automatically starts streaming from the connection again.
Golang makes this pretty easy, but I'm sure there's a similar solution for other languages (I know zig has multireaders).
Sidenote: Additionally, between the detection of the mimetype and the streaming of the multireader, I also introduce another reader that, when read from (I'm wrapping the multireader in this new reader), will read from the multireader and auto-convert from the encoding of the response to UTF-8, which is another thing golang makes really easy to do, and I believe is in its stdlib. I only do this for text mimetypes, of course. This is just another thing you may want to also consider.
Readers and multireaders are very powerful when you get the hang of them, and they make these types of things so much easier to do, imo :D
🎸 bluesman [OP] · 2025-03-24 at 15:37:
That makes sense and sounds pretty cool. Right now, I'm examining the bytes beyond the response line in the first buffer to see if they're utf-8. Yes, I guess it's possible the next buffer off the wire may contain binary somehow. It's also possible that the first buffer ONLY contains the response line (I saw this with the spartan echo service). Accounting for that would mean even more buffering before I start handing off to the renderer. What I have seems reasonable.
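One wrinkle with testing a single buffer for UTF-8 is that the buffer may end in the middle of a multi-byte character. A minimal Go sketch of a check that tolerates that case is below; `looksLikeUTF8` is a hypothetical helper, not Alhena's code (Alhena is built on vert.x), and treating an empty buffer as "no evidence against text" is an assumption the caller may want to handle differently.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// looksLikeUTF8 (hypothetical helper) reports whether buf is valid UTF-8,
// allowing the final character to be cut off by the buffer boundary.
// An empty buffer returns true: there is simply nothing to object to.
func looksLikeUTF8(buf []byte) bool {
	for len(buf) > 0 {
		r, size := utf8.DecodeRune(buf)
		if r == utf8.RuneError && size == 1 {
			// Either a genuinely invalid byte, or the start of a
			// multi-byte character whose tail has not arrived yet.
			// FullRune is false only in the second case.
			return !utf8.FullRune(buf)
		}
		buf = buf[size:]
	}
	return true
}

func main() {
	fmt.Println(looksLikeUTF8([]byte("plain text")))        // true
	fmt.Println(looksLikeUTF8([]byte{0x89, 'P', 'N', 'G'})) // false
	fmt.Println(looksLikeUTF8([]byte("caf\xc3")))           // true: the last character is cut in half
}
```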
Your comment on the reader auto-converting to UTF-8 had me wondering. I looked at the code and the vert.x methods I'm using convert to UTF-8 anyway. Good thing that's what I want. Right?
clseibold · 2025-03-24 at 15:40:
@bluesman I should clarify that my mimetype detection can also detect the encoding, and so my reader that auto-converts to UTF-8 will take the encoding as a parameter so that it knows what the data is in.
Also, I read the header before creating the reader to detect the mimetype of the data, so only the data part is actually getting buffered.
🎸 bluesman [OP] · 2025-03-25 at 04:17:
Of course, I'm seeing differences in received buffer sizes on macOS compared to Windows. For the text docs previously mentioned, I typically only get the response header in the first buffer (but only on Windows). That means Alhena can't test for text and therefore has to "trust" the mime type. I made a change to buffer more and that fixes the issue even though it worked fine on macOS to begin with. It seems like it would be better if the server would just send the correct type.
Original Post
Mime Type: I added Spartan support to Alhena. Just when I thought I was done, I was testing some links in the ufo section at mozz.us. For text files with a non-text extension, the server sends "application/octet-stream" as the mime type. Alhena sees that and treats those as binary, presenting a download dialog. Lagrange either ignores the mime type or, more likely, determines the file is text anyway and displays the contents. Any thoughts on how to handle this scenario? In my mind, how it's...