python - How to replace a string multiple times?


I have 10,000 lines of source code with tons of duplication. I read the file in as text.

Example:

    assert pyarray_type(real0) == np.npy_double, "real0 not double"
    assert real0.ndim == 1, "real0 has wrong dimensions"
    if not (pyarray_flags(real0) & np.npy_c_contiguous):
        real0 = pyarray_getcontiguous(real0)
    real0_data = <double*>real0.data

I want to replace all occurrences of this pattern with

    real0_data = _get_data(real0, "real0") 

where real0 can be any variable name matching [a-z0-9]+.

So don't be confused by the source code. The code itself doesn't matter; this is about text processing and regex.

This is what I have so far:

    path = "func.pyx"
    source_string = open(path,"r").read()

    pattern = r"""
    assert pyarray_type\(([a-z0-9]+)\) == np.npy_double, "([a-z0-9]+) not double"
    assert ([a-z0-9]+).ndim == 1, "([a-z0-9]+) has wrong dimensions"
    if not (pyarray_flags(([a-z0-9]+)) & np.npy_c_contiguous):
       ([a-z0-9]+) = pyarray_getcontiguous(([a-z0-9]+))
    ([a-z0-9]+)_data = ([a-z0-9]+).data"""

    

You can do this in any text editor that supports multiline regular expression search and replace.

I used Komodo IDE to test this, because it includes an excellent regular expression tester ("Rx Toolkit") for experimenting with regular expressions. I think there are also online tools that do this. The same regular expression works in the free Komodo Edit. It should also work in most other editors that support Perl-compatible regular expressions.

In Komodo, I used the Replace dialog with the Regex option checked, to find:

assert pyarray_type\((\w+)\) == np\.npy_double, "\1 not double"\s*\n\s*assert \1\.ndim == 1, "\1 has wrong dimensions"\s*\n\s*if not \(pyarray_flags\(\1\) & np\.npy_c_contiguous\):\s*\n\s*\1 = pyarray_getcontiguous\(\1\)\s*\n\s*\1_data = <double\*>\1\.data 

and replace with:

\1_data = _get_data(\1, "\1") 

Given this test code:

    assert pyarray_type(real0) == np.npy_double, "real0 not double"
    assert real0.ndim == 1, "real0 has wrong dimensions"
    if not (pyarray_flags(real0) & np.npy_c_contiguous):
        real0 = pyarray_getcontiguous(real0)
    real0_data = <double*>real0.data

    assert pyarray_type(real1) == np.npy_double, "real1 not double"
    assert real1.ndim == 1, "real1 has wrong dimensions"
    if not (pyarray_flags(real1) & np.npy_c_contiguous):
        real1 = pyarray_getcontiguous(real1)
    real1_data = <double*>real1.data

    assert pyarray_type(real2) == np.npy_double, "real2 not double"
    assert real2.ndim == 1, "real2 has wrong dimensions"
    if not (pyarray_flags(real2) & np.npy_c_contiguous):
        real2 = pyarray_getcontiguous(real2)
    real2_data = <double*>real2.data

The result is:

    real0_data = _get_data(real0, "real0")

    real1_data = _get_data(real1, "real1")

    real2_data = _get_data(real2, "real2")

So how did I get that regular expression from the original code?

  1. Prefix all instances of (, ), ., and * with \ to escape them (an easy manual search and replace).
  2. Replace the first instance of real0 with (\w+). This matches and captures a string of word characters (letters, digits, and underscore).
  3. Replace the remaining instances of real0 with \1. This matches the text captured by (\w+).
  4. Replace each newline and the leading space on the next line with \s*\n\s*. This matches any trailing space on the line, plus the newline, plus any leading space on the next line. That way the regular expression works regardless of the nesting level of the code it's matching.

Finally, the "replace" text uses \1 wherever it needs the original captured text.

You could of course use a similar regular expression in Python if you want to do it that way. I would suggest using \w instead of [a-z0-9] just to make it simpler. Also, don't include the newlines and leading spaces literally in a multiline string; use the \s*\n\s* approach I used instead. That way the pattern is independent of the nesting level, as I mentioned above.
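For instance, here is a minimal sketch of the Python version. It uses the same find and replace patterns as the Komodo example with the standard `re` module; the function name `collapse_checks` is made up for illustration.

```python
import re

# Same pattern as the editor search, split across adjacent string literals
# for readability. \1 refers back to the name captured by the first (\w+).
PATTERN = re.compile(
    r'assert pyarray_type\((\w+)\) == np\.npy_double, "\1 not double"\s*\n\s*'
    r'assert \1\.ndim == 1, "\1 has wrong dimensions"\s*\n\s*'
    r'if not \(pyarray_flags\(\1\) & np\.npy_c_contiguous\):\s*\n\s*'
    r'\1 = pyarray_getcontiguous\(\1\)\s*\n\s*'
    r'\1_data = <double\*>\1\.data'
)

# In the replacement template, \1 inserts the captured variable name.
REPLACEMENT = r'\1_data = _get_data(\1, "\1")'


def collapse_checks(source):
    """Return source with each matching five-line block collapsed to one line."""
    return PATTERN.sub(REPLACEMENT, source)
```

To apply it to the file from the question, read `func.pyx` into a string, pass it to `collapse_checks`, and write the result back out. Because every later `\1` must equal the first captured name, a block that mixes two different variable names is left untouched.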

