👽 tacomanator

As far as I can tell, CJK logs are effectively confined to about 100 characters (maybe a sentence or two) due to the 1024-byte limit on Gemini URIs: each percent-encoded CJK character takes 9 bytes. Is this just "the way it is"?

Original log did not show in feed for some reason, so this is a re-log.

1 month ago

5 Replies

👽 tacomanator

@sy thank you!

I have since learned about Titan, but it’s not supported here, so the ability to skip encoding will be helpful.

Now to find the setting… · 1 month ago

👽 sy

@tacomanator you can send them un-encoded, if your client supports it. Below, there are 311 chars. · 1 month ago

👽 sy

日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日日 · 1 month ago
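To put numbers on the encoded versus un-encoded difference, here is a minimal sketch. It assumes the 1024-byte limit covers the whole request URI, and the `PREFIX` value and the helper names `max_chars` and `fits` are hypothetical, made up for illustration (a real station URL will be a different length). It also assumes the server accepts raw UTF-8, as the reply above suggests.

```python
from urllib.parse import quote

URI_LIMIT = 1024  # byte limit on Gemini URIs mentioned in this thread

# Hypothetical prefix; the real scheme/host/path will differ.
PREFIX = "gemini://example.org/post?"


def max_chars(bytes_per_char: int, prefix: str = PREFIX) -> int:
    """How many characters of a given byte width fit after the prefix."""
    budget = URI_LIMIT - len(prefix.encode("utf-8"))
    return budget // bytes_per_char


def fits(text: str, encode: bool, prefix: str = PREFIX) -> bool:
    """Check whether prefix + text stays within the URI byte limit."""
    payload = quote(text, safe="") if encode else text
    return len((prefix + payload).encode("utf-8")) <= URI_LIMIT


message = "日" * 311  # same length as the sample log above

print("CJK chars that fit, percent-encoded:", max_chars(9))
print("CJK chars that fit, raw UTF-8:", max_chars(3))
print("311 chars fit when encoded:", fits(message, encode=True))
print("311 chars fit when raw:", fits(message, encode=False))
```

With this made-up prefix, the raw form of the 311-character message takes 933 bytes, while the percent-encoded form takes 2799 bytes.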

👽 tacomanator

@tm85 yep, 9 bytes. For example, Japanese 日 in UTF-8 is E6 97 A5, which becomes %E6%97%A5 in the URL (9 characters).

If your client shows the remaining byte count, you can verify this empirically by watching it go down by 9 with every CJK character. Or try an emoji like 😥, which is 4 bytes in UTF-8 and therefore 12 bytes URL-encoded. · 1 month ago
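The arithmetic above can be checked with a short sketch using Python's standard `urllib.parse.quote`. The helper name `encoded_size` is made up for illustration, and whether a given client encodes exactly this way is an assumption.

```python
from urllib.parse import quote


def encoded_size(text: str) -> tuple[int, int]:
    """Return (UTF-8 byte count, percent-encoded length) for text."""
    raw = text.encode("utf-8")
    # safe="" forces every reserved or non-ASCII byte to be escaped as %XX
    return len(raw), len(quote(text, safe=""))


for ch in ["日", "😥", "a"]:
    raw_len, enc_len = encoded_size(ch)
    print(ch, quote(ch, safe=""), raw_len, enc_len)
```

Each UTF-8 byte outside the unreserved ASCII set turns into three characters (`%` plus two hex digits), which is where the factor of three comes from.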

👽 tm85

9 bytes??? Are you sure??? Is this not UTF-8? The widest characters should be 4 bytes · 1 month ago