I am trying to build a Superset view based on my QuestDB data. I store some URLs that I need to translate to encoded values to create an image URL to a hosted service, so my URL needed to be encoded to make that request. I don’t want to store this in the DB as. an encoded URL if I can avoid it, because it takes more storage space (it’s a long URL), and because that URL can be used in several different contexts.
I don’t see a url_encode
function in QuestDB or the SQLAlchemy extension. Is there any other way you know to do this? Can I register a custom function or something? This seems like it would be really useful and low hanging fruit.
Hi @cardilloscreations ,
I don’t think we have a function for this. However, we can support you in contributing one.
We have a custom HTTP client (questdb/core/src/main/java/io/questdb/cutlass/http/client/HttpClient.java at 2d5aedf000299f358818de8a18963bb922434f20 · questdb/questdb · GitHub).
There is code to URL encode in there. This could be copied and repurposed in griffin
, the query engine. It has quite an accessible API.
Put simply, you would need a new VarcharFunction
with this signature: url_encode(Ø)
. See FunctionFactory
for an explanation of that signature - it means url_encode takes one varchar as an argument
. Refer to existing functions that act on strings/varchar for inspiration.
If you implement one in a fork, you can open a PR on the main repo and upstream your change so it comes into the next release.
Feel free to connect on Slack for support if needed 
P.S It’d be good to round-trip it and have a matching url_decode
function!
As a follow up, if you are storing URLs with common roots, you might want to split the urls into parts.
For example, if you had stuff like this:
https://community.questdb.com/t/url-encode-a-column-value/760
https://community.questdb.com/t/cannot-change-default-admin-username-and-password/756
https://community.questdb.com/t/problem-with-case-statement/757
You might want to split it into something like:
CREATE TABLE urls (
root SYMBOL,
thread VARCHAR
);
INSERT INTO urls (root, thread) VALUES
('https://community.questdb.com/t', '/url-encode-a-column-value/760'),
('https://community.questdb.com/t', '/cannot-change-default-admin-username-and-password/756'),
('https://community.questdb.com/t', '/problem-with-case-statement/757')
SYMBOL
types are dictionary encoded, so each of your ‘root’ entries will only take 4 bytes for each row, saving a lot of space.
We have text functions: Text functions | QuestDB
No url path joining functions as far as I know, but you could upstream this too.
Not a bad idea. For now, I just ended up storing them encoded so it will be easier for report builder queries and then programattic use can always decode if needed. After review, it only ends up being about 12-20 more bytes per record.