Comment by 🚀 ColonelThirtyTwo

Re: "It seems that CJK (Chinese-Japanese-Korean) posts are…"

In: s/AskGemini

@MrSVCD UTF-8 is at most 4 bytes per character, but those bytes then get percent-encoded, further driving up the bytes per character.
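To make the byte counts above concrete, here is a minimal sketch using Python's standard `urllib.parse.quote`. The helper name `encoded_cost` and the sample characters are chosen for illustration only; the sketch assumes every non-unreserved byte is escaped as `%HH`.

```python
from urllib.parse import quote

def encoded_cost(ch: str) -> tuple:
    """Return (UTF-8 byte count, percent-encoded length) for one character."""
    utf8_len = len(ch.encode("utf-8"))
    # safe="" makes quote() escape everything except unreserved characters,
    # so each escaped UTF-8 byte becomes three ASCII characters (%HH).
    encoded_len = len(quote(ch, safe=""))
    return utf8_len, encoded_len

# Illustrative samples: ASCII, Latin with accent, Chinese, Korean, emoji.
for ch in ["a", "é", "漢", "한", "🚀"]:
    utf8_len, encoded_len = encoded_cost(ch)
    print(f"{ch!r}: {utf8_len} UTF-8 byte(s) -> {encoded_len} byte(s) percent-encoded")
```

Under these assumptions, a character that needs 3 bytes in UTF-8 costs 9 bytes in the URI, and a 4-byte character costs 12.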

🚀 ColonelThirtyTwo

Mar 12 · 8 weeks ago

5 Later Comments ↓

🌆 skyjake [mod...] · Mar 12 at 08:00:

@tacomanator Bubble (which runs this site) supports Titan for making and editing long posts. This is documented in the Help:

=> /help

Using the Bubble draft composer, you can also effectively submit long posts and comments as multiple Gemini requests.
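As a rough illustration of the idea of spreading one long text over several requests, the sketch below splits a string into chunks whose percent-encoded size stays under a byte budget. This is not Bubble's actual implementation; the function name `split_for_requests` and the budget of 900 bytes (leaving some room for the rest of the URL) are assumptions made for the example.

```python
from urllib.parse import quote

def split_for_requests(text: str, budget: int) -> list:
    """Split text into chunks whose percent-encoded form fits in `budget` bytes."""
    chunks, current, used = [], [], 0
    for ch in text:
        cost = len(quote(ch, safe=""))  # encoded size of this one character
        if cost > budget:
            raise ValueError("budget is smaller than a single encoded character")
        if used + cost > budget:
            # Current chunk is full: close it and start a new one.
            chunks.append("".join(current))
            current, used = [], 0
        current.append(ch)
        used += cost
    if current:
        chunks.append("".join(current))
    return chunks

# 300 kanji at 9 encoded bytes each, with a hypothetical 900-byte budget.
parts = split_for_requests("漢字" * 150, budget=900)
print(len(parts), [len(p) for p in parts])
```

Splitting on whole characters keeps each chunk valid UTF-8, so no request ends in the middle of a multi-byte sequence.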

Station does not support Titan, nor does it allow appending text to previously submitted entries.

Titan is used by some to edit their capsules, gemlogs, and/or tinylogs. I have no examples off the top of my head apart from my own skyjake.fi, where I've got a private Titan edit feature.

🚂 MrSVCD · Mar 12 at 10:39:

@ColonelThirtyTwo That is true, but the most common C&K characters have their own entries in Unicode.

I think that Unicode is trying to go percent-encoded to avoid going to 5 bytes of UTF-8.

tacomanator [OP] · Mar 12 at 23:58:

@skyjake thank you for your help. From there I found a way to post long text from the draft page after enabling Titan in the BBS settings.

The help mentions a ":" command to enter long text mode. I haven't figured out how to get that to work yet, but for now I'm happy to have at least one working method!

🚬 sy · Mar 13 at 15:47:

Maybe this (RFC 2718 §2.2.5) should be explicitly allowed in the Gemini specification:

Unless there is some compelling reason for a particular scheme to do otherwise, translating character sequences into UTF-8 and then subsequently using the %HH encoding for *unsafe* octets is recommended.

Apparently most servers (including BBS and Station) already allow it.

=> Test with more than 300 kanji characters

🚂 MrSVCD · Mar 13 at 18:04:

Thanks @sy, that explains the difference between what I thought and what OP said.

Original Post

🌒 s/AskGemini

tacomanator:

It seems that CJK (Chinese-Japanese-Korean) posts are effectively limited to about 100 characters, a sentence or two, due to the limit of 1024 bytes for URIs in Gemini (each character is 9 bytes after encoding). Has there been discussion on this matter?
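The estimate in this post can be checked with a little arithmetic. The sketch below assumes the 1024-byte URI limit mentioned in the post and 9 encoded bytes per CJK character; the URL prefix is a placeholder, not a real Bubble endpoint, and `max_chars` is a name chosen for this example.

```python
URI_LIMIT = 1024  # maximum URI length in bytes, as stated in the post

def max_chars(prefix: str, bytes_per_char: int, limit: int = URI_LIMIT) -> int:
    """How many characters of a given encoded size fit after `prefix`."""
    remaining = limit - len(prefix.encode("utf-8"))
    return max(remaining, 0) // bytes_per_char

# Hypothetical request URL; the host and path are placeholders.
prefix = "gemini://example.org/post?"
print(max_chars(prefix, 9))  # CJK: 3 UTF-8 bytes, each encoded as %HH
print(max_chars(prefix, 1))  # unreserved ASCII letters and digits
```

Even with an empty prefix the ceiling is 1024 // 9 = 113 CJK characters, so any real URL leaves room for roughly 100 to 110, which matches the figure in the post.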

💬 10 comments · Mar 11 · 8 weeks ago