.NET - How to implement a string with 1-byte chars (and save memory)


How can I implement a single-byte-based string?

The application uses a large list of words.
The words come from SQL and are varchar (single byte).
Each word also has an Int32 ID.
I download the words to:

Dictionary<Int32, string>

for performance.

The problem is that the dictionary gets so large that I get an out-of-memory exception.
We end up splitting the data.
The app hits the list so often that hitting SQL on each request is not an option.
The database is already very active.
Dynamically paging in and out of the dictionary is not an option either: it is bound to a ListView, and with virtualization that works great.
The words are only loaded at night; the user just needs a static list.
Users use the words to search and process other data, but they don't process the words themselves.

Since the source is single-byte char, I thought I could implement a single-byte-based word:

using System;
using System.Text;

public class StringByte1252 : Object, IComparable, IComparable<StringByte1252>
{
    static Encoding win1252 = Encoding.GetEncoding("windows-1252");

    public Int32 ID { get; private set; }
    public byte[] Bytes { get; private set; }

    // Decodes on every access; nothing is cached.
    public string Value { get { return win1252.GetString(Bytes); } }
    public Int32 Length { get { return Bytes.Length; } }

    public int CompareTo(object obj)
    {
        if (obj == null)
        {
            return 1;
        }
        StringByte1252 other = obj as StringByte1252;
        if (other == null)
        {
            throw new ArgumentException("A StringByte1252 object is required for comparison.", "obj");
        }
        return this.CompareTo(other);
    }
    public int CompareTo(StringByte1252 other)
    {
        if (object.ReferenceEquals(other, null))
        {
            return 1;
        }
        return string.Compare(this.Value, other.Value, StringComparison.OrdinalIgnoreCase);
    }
    public override bool Equals(Object obj)
    {
        // Check for null and compare run-time types.
        if (obj == null || !(obj is StringByte1252)) return false;
        StringByte1252 item = (StringByte1252)obj;
        // Note: this compares array references, not array contents.
        return (this.Bytes == item.Bytes);
    }
    public override int GetHashCode() { return ID; }

    public StringByte1252(Int32 id, byte[] bytes) { ID = id; Bytes = bytes; }
}

The above works, but it is not more memory efficient than

Dictionary<Int32, string>

The dictionary with Int16-based characters actually uses less memory.

Where did I go wrong?
Does a byte array take more space than the sum of its bytes?
Is there a way to achieve a single-byte string?

An array has approximately 50 bytes of overhead in the 64-bit runtime. In the 32-bit runtime it's a little less: perhaps 40 bytes. There's the standard .NET allocation overhead (24 bytes in the 64-bit runtime), and there's the metadata for the array: number of dimensions, length, etc. So you can't save memory by using individual byte arrays to store short strings, because the per-array overhead outweighs the bytes you save on each character.

One way is to allocate one large array of bytes and store the strings in that array, UTF-8 encoded. Your dictionary then becomes a Dictionary<int, int>, with the value being an index into the array.
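A minimal sketch of that idea follows. It is not the code from the article mentioned in this answer; the class name PackedStringTable and the 2-byte length prefix in front of each string are assumptions made for illustration (a terminator byte or a separate length table would work too).

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: packs many short strings into one shared byte array.
public sealed class PackedStringTable
{
    private byte[] _buffer = new byte[1024];
    private int _used;

    // Maps a word ID to the offset of its length prefix in _buffer.
    private readonly Dictionary<int, int> _offsetById = new Dictionary<int, int>();

    public void Add(int id, string word)
    {
        byte[] encoded = Encoding.UTF8.GetBytes(word);
        if (encoded.Length > ushort.MaxValue)
            throw new ArgumentException("Word is too long for a 2-byte length prefix.", nameof(word));

        // Grow the shared buffer (doubling) when the new entry does not fit.
        int needed = _used + 2 + encoded.Length;
        if (needed > _buffer.Length)
            Array.Resize(ref _buffer, Math.Max(needed, _buffer.Length * 2));

        // Throws on a duplicate ID before anything is written.
        _offsetById.Add(id, _used);

        // Assumed layout: little-endian 2-byte length, then the UTF-8 bytes.
        _buffer[_used++] = (byte)(encoded.Length & 0xFF);
        _buffer[_used++] = (byte)(encoded.Length >> 8);
        Buffer.BlockCopy(encoded, 0, _buffer, _used, encoded.Length);
        _used += encoded.Length;
    }

    public string Get(int id)
    {
        int offset = _offsetById[id];
        int length = _buffer[offset] | (_buffer[offset + 1] << 8);
        return Encoding.UTF8.GetString(_buffer, offset + 2, length);
    }
}
```

Each word now costs its encoded bytes plus two bytes of length, instead of a separate object allocation. A string is only materialized when Get is called, which suits a virtualized list that shows a few rows at a time. Since the words are loaded once at night, the buffer could also be trimmed to its used size after loading.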

I showed how to do that in my article Reducing Memory Required for Strings. I was able to save 50% over normal string allocation that way. See the article for more detail.

Another problem is that dictionary overhead is 24 bytes per entry. That's pretty expensive if you have a whole bunch of small objects. You might consider instead making a list of structures, sorting it by ID, and using binary search. It's not the O(1) access that a dictionary gives you, but for a user interface it's plenty fast enough. And the overhead is only 8 bytes per entry.

The struct would be something like:

struct WordEntry
{
    public readonly int Id;
    public readonly int IndexIntoStringTable;
}
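The lookup side could be sketched as below. The constructor, the comparer class WordEntryByIdComparer, and the WordIndex helper are hypothetical names added for illustration; only the two fields come from the struct above. The sketch assumes the entries are sorted once after the nightly load and are not modified afterwards.

```csharp
using System;
using System.Collections.Generic;

public struct WordEntry
{
    public readonly int Id;
    public readonly int IndexIntoStringTable;

    public WordEntry(int id, int indexIntoStringTable)
    {
        Id = id;
        IndexIntoStringTable = indexIntoStringTable;
    }
}

// Orders entries by Id only, so a probe entry needs no valid table index.
public sealed class WordEntryByIdComparer : IComparer<WordEntry>
{
    public int Compare(WordEntry x, WordEntry y)
    {
        return x.Id.CompareTo(y.Id);
    }
}

public static class WordIndex
{
    private static readonly WordEntryByIdComparer ById = new WordEntryByIdComparer();

    // Call once after loading; BinarySearch requires sorted input.
    public static void SortById(WordEntry[] entries)
    {
        Array.Sort(entries, ById);
    }

    // Returns the string-table index for the given ID, or -1 if the ID is absent.
    public static int FindStringIndex(WordEntry[] sortedEntries, int id)
    {
        int pos = Array.BinarySearch(sortedEntries, new WordEntry(id, 0), ById);
        return pos >= 0 ? sortedEntries[pos].IndexIntoStringTable : -1;
    }
}
```

An array of these structs stores the two ints of each entry inline, with no per-entry object header or hash bucket, and a lookup takes O(log n) comparisons.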
