python 2.7 - Is there a JavaScript equivalent to urllib.quote and urllib.unquote?
According to an answer given to an identical question several years ago, encodeURIComponent(str) in JavaScript should be equivalent to urllib.quote(str, safe='~()*!.\'') in Python. By extension, I would have guessed that decodeURIComponent(str) is equivalent to urllib.unquote(str).
This has not been the case in my experience. I'm writing networking code to communicate between a Python server and a client on a website, and I'm getting different results.
I'm generating a unique ID and sending it over TCP using code identical to the following:
    import urllib
    import struct
    import random

    def sendid():
        id = random.SystemRandom().getrandbits(128)
        upper = id >> 64
        lower = id & 0xFFFFFFFFFFFFFFFF
        packed = struct.pack('<B2Q', 0x00, upper, lower)
        encoded = urllib.quote(packed, safe='~()*!.\'')
        # The line below sends over an established TCP connection.
        # That code is irrelevant here and is working as expected.
        sendtoclient(encoded)
The message is received on the client side in the following WebSocket object callback:
    this.websocket.onmessage = function (msg) {
        console.log(msg.data);
        var stype = bufferpack.unpack('<B', decodeURIComponent(msg.data).substring(0, 1));
        console.log(stype);
    };
This should decode the msg.data string and set stype to the first 'part' of the packed data (in this case 0x00).
The problem I'm encountering is that these functions are not working how I expected. After testing in JSFiddle and on the Python command line, I am getting different results from the encode/decodeURIComponent and urllib.quote/unquote functions. encodeURIComponent gives me a different result than the 'equivalent' urllib.quote, and decodeURIComponent results in a malformed URI error.
This can be seen in the sample shown below:
    >>> import random
    >>> import urllib
    >>> import struct
    >>> id = random.SystemRandom().getrandbits(128)
    >>> upper = id >> 64
    >>> lower = id & 0xFFFFFFFFFFFFFFFF
    >>> packed = struct.pack('<B2Q', 0x00, upper, lower)
    >>> encoded = urllib.quote(packed, safe='~()*!.\'')
    >>> id
    79837607446780471980532690349264559028L
    >>> upper
    4328005371992213727L
    >>> lower
    4092443888854326196L
    >>> packed
    '\x00\xdf\x08\x94\x7f\xf4)\x10<\xb4[a\xc2\x08H\xcb8'
    >>> encoded
    '%00%DF%08%94%7F%F4)%10%3C%B4%5Ba%C2%08H%CB8'
However, when I use encodeURIComponent and decodeURIComponent on 'packed' and 'encoded' respectively, I get a different encoded value, and decoding throws an error. The JavaScript, followed by its output, is shown below.
    console.log(encodeURIComponent('\x00\xdf\x08\x94\x7f\xf4)\x10<\xb4[a\xc2\x08H\xcb8'));
    console.log(decodeURIComponent('%00%DF%08%94%7F%F4)%10%3C%B4%5Ba%C2%08H%CB8'));

    %00%C3%9F%08%C2%94%7F%C3%B4)%10%3C%C2%B4%5Ba%C3%82%08H%C3%8B8
    (index):50 Uncaught URIError: URI malformed
The above JavaScript code is also available as a JSFiddle snippet for convenience.
So finally, my actual question: are the functions used above (quote/unquote and encode/decodeURIComponent) equivalent? If not, can anyone suggest code changes or other libraries/functions that would give me what I'm expecting (the encoded/decoded and packed/unpacked values being the same on both the client and server side)?
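After playing around more with the example code and reading other resources on similar issues, I found that the 'packed' string was being encoded using the 'latin-1' character set, and urllib.quote was not working with that. urllib.quote percent-encodes each raw byte as-is, whereas encodeURIComponent first converts each character to UTF-8 and percent-encodes the resulting bytes.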
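As a rough illustration of that mismatch, here is a minimal sketch. It assumes Python 3, where urllib.quote lives at urllib.parse.quote, rather than the Python 2.7 used in the question; the SAFE constant and the variable names are only illustrative. It mimics what the browser does by UTF-8 decoding the percent-decoded bytes, which is the step that fails and most likely corresponds to the "URI malformed" error.

```python
import struct
from urllib.parse import quote, unquote_to_bytes

# Characters left unescaped, taken from the safe= argument in the question.
SAFE = "~()*!.'"

# The same 17 bytes as 'packed' in the interpreter session.
packed = struct.pack('<B2Q', 0x00, 4328005371992213727, 4092443888854326196)

# Quoting the bytes directly percent-encodes each raw byte (Python 2 urllib.quote behaviour).
raw_quoted = quote(packed, safe=SAFE)
print(raw_quoted)  # %00%DF%08%94%7F%F4)%10%3C%B4%5Ba%C2%08H%CB8

# Treating each byte as a code point and UTF-8 encoding it first
# gives the same result as encodeURIComponent.
js_style = quote(packed.decode('latin-1'), safe=SAFE, encoding='utf-8')
print(js_style)  # %00%C3%9F%08%C2%94%7F%C3%B4)%10%3C%C2%B4%5Ba%C3%82%08H%C3%8B8

# decodeURIComponent expects the escapes to form valid UTF-8.
# The raw-byte form does not: 0xDF starts a two-byte sequence but 0x08 cannot continue it.
try:
    unquote_to_bytes(raw_quoted).decode('utf-8')
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
print(valid_utf8)  # False
```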
Below I've included the same example from the Python interpreter, with a few extra lines showing that, with the proper encoding, the urllib.quote/unquote and encode/decodeURIComponent functions are in fact equivalent when dealing with UTF-8.
    >>> import random
    >>> import urllib
    >>> import struct
    >>> id = random.SystemRandom().getrandbits(128)
    >>> upper = id >> 64
    >>> lower = id & 0xFFFFFFFFFFFFFFFF
    >>> packed = struct.pack('<B2Q', 0x00, upper, lower)
    >>> encoded = urllib.quote(packed, safe='~()*!.\'')
    >>> id
    79837607446780471980532690349264559028L
    >>> upper
    4328005371992213727L
    >>> lower
    4092443888854326196L
    >>> packed
    '\x00\xdf\x08\x94\x7f\xf4)\x10<\xb4[a\xc2\x08H\xcb8'
    >>> encoded
    '%00%DF%08%94%7F%F4)%10%3C%B4%5Ba%C2%08H%CB8'
    >>> packed.decode('latin-1')
    u'\x00\xdf\x08\x94\x7f\xf4)\x10<\xb4[a\xc2\x08H\xcb8'
    >>> packed.decode('latin-1').encode('utf-8')
    '\x00\xc3\x9f\x08\xc2\x94\x7f\xc3\xb4)\x10<\xc2\xb4[a\xc3\x82\x08H\xc3\x8b8'
    >>> urllib.quote(packed.decode('latin-1').encode('utf-8'), safe='~()*!.\'')
    '%00%C3%9F%08%C2%94%7F%C3%B4)%10%3C%C2%B4%5Ba%C3%82%08H%C3%8B8'

The output

    '%00%C3%9F%08%C2%94%7F%C3%B4)%10%3C%C2%B4%5Ba%C3%82%08H%C3%8B8'

matches the output produced by

    encodeURIComponent('\x00\xdf\x08\x94\x7f\xf4)\x10<\xb4[a\xc2\x08H\xcb8')

in JavaScript.
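To get the same packed value on both sides, the server can apply this conversion in both directions. The following is only a sketch under some assumptions: it uses Python 3 (urllib.parse) instead of the Python 2.7 shown above, and the helper names encode_for_js and decode_from_js are hypothetical, not part of any library.

```python
import struct
from urllib.parse import quote, unquote

# Characters left unescaped, taken from the safe= argument in the question.
SAFE = "~()*!.'"

def encode_for_js(packed: bytes) -> str:
    """Percent-encode bytes so that decodeURIComponent yields one character per byte."""
    # latin-1 maps every byte 0x00-0xFF to the code point with the same value.
    return quote(packed.decode('latin-1'), safe=SAFE, encoding='utf-8')

def decode_from_js(text: str) -> bytes:
    """Turn encodeURIComponent output back into the original bytes."""
    # Raises UnicodeEncodeError if the client sent a character above U+00FF.
    return unquote(text, encoding='utf-8').encode('latin-1')

packed = struct.pack('<B2Q', 0x00, 4328005371992213727, 4092443888854326196)
wire = encode_for_js(packed)       # what would be sent to the client
restored = decode_from_js(wire)    # what the server recovers from a client reply
msg_type, upper, lower = struct.unpack('<B2Q', restored)
print(msg_type, upper, lower)      # 0 4328005371992213727 4092443888854326196
```

One cost of this approach is size: every byte at or above 0x80 becomes two percent-escapes (six characters) on the wire instead of one.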