Python - curve fitting by a sum of Gaussians with SciPy


I'm doing bioinformatics: we map small RNAs onto mRNAs. We have the mapping coordinates of a protein on each mRNA, and we calculate the relative distance between the place where the protein is bound on the mRNA and the site bound by a small RNA.

I obtained the following dataset:

    dist  eff
     -69    3
     -68    2
     -67    1
     -66    1
     -60    1
     -59    1
     -58    1
     -57    2
     -56    1
     -55    1
     -54    1
     -52    1
     -50    2
     -48    3
     -47    1
     -46    3
     -45    1
     -43    1
       0    1
       1    2
       2   12
       3   18
       4   18
       5   13
       6    9
       7    7
       8    5
       9    3
      10    1
      13    2
      14    3
      15    2
      16    2
      17    2
      18    2
      19    2
      20    2
      21    3
      22    1
      24    1
      25    1
      26    1
      28    2
      31    1
      38    1
      40    2

When I plot the data, I see three peaks: one around 3 to 4, one around 20, and a last one around -50.

I tried cubic spline interpolation, but it doesn't work for my data.

My idea was to do curve fitting with a sum of Gaussians. For example, in my case, estimate three Gaussian curves at the points 5, 20 and -50.

How can I do that?

I looked at scipy.optimize.curve_fit(), but how can I fit a curve over a precise interval? And how can I add the curves together to get one single curve?
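On the "precise interval" part of the question, one possible approach (a minimal sketch, not taken from the original post) is to select the points of interest with a boolean mask and pass only those to curve_fit(). The helper name `gaussian`, the window limits `lo` and `hi`, and the starting width of 2 are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

# Dataset from the question, stored as flat (dist, eff) pairs.
data = np.array([-69, 3, -68, 2, -67, 1, -66, 1, -60, 1, -59, 1,
                 -58, 1, -57, 2, -56, 1, -55, 1, -54, 1, -52, 1,
                 -50, 2, -48, 3, -47, 1, -46, 3, -45, 1, -43, 1,
                 0, 1, 1, 2, 2, 12, 3, 18, 4, 18, 5, 13, 6, 9,
                 7, 7, 8, 5, 9, 3, 10, 1, 13, 2, 14, 3, 15, 2,
                 16, 2, 17, 2, 18, 2, 19, 2, 20, 2, 21, 3, 22, 1,
                 24, 1, 25, 1, 26, 1, 28, 2, 31, 1, 38, 1, 40, 2],
                dtype=float)
x, y = data.reshape(-1, 2).T


def gaussian(x, amp, mean, sigma):
    """Single Gaussian bump whose peak height is `amp` (hypothetical helper)."""
    return amp * np.exp(-0.5 * ((x - mean) / sigma) ** 2)


# Assumed window around the central peak; adjust lo/hi for the other peaks.
lo, hi = -1, 11
mask = (x >= lo) & (x <= hi)
x_win, y_win = x[mask], y[mask]

# Starting values read off the windowed data; the width of 2 is a guess.
p0 = [y_win.max(), x_win[y_win.argmax()], 2.0]
(amp_fit, mean_fit, sigma_fit), _ = curve_fit(gaussian, x_win, y_win, p0=p0)

print(amp_fit, mean_fit, abs(sigma_fit))
```

The same mask can be reused to plot or evaluate the fitted curve only inside the window. Fitting each peak separately this way gives per-peak parameters, which can then serve as starting values for a joint fit of the summed curve.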

    import numpy as np
    import matplotlib.pyplot as plt
    import scipy.stats
    import scipy.optimize

    # Flat list of (dist, eff) pairs from the question.
    data = np.array([-69, 3, -68, 2, -67, 1, -66, 1, -60, 1, -59, 1,
                     -58, 1, -57, 2, -56, 1, -55, 1, -54, 1, -52, 1,
                     -50, 2, -48, 3, -47, 1, -46, 3, -45, 1, -43, 1,
                     0, 1, 1, 2, 2, 12, 3, 18, 4, 18, 5, 13, 6, 9,
                     7, 7, 8, 5, 9, 3, 10, 1, 13, 2, 14, 3, 15, 2,
                     16, 2, 17, 2, 18, 2, 19, 2, 20, 2, 21, 3, 22, 1,
                     24, 1, 25, 1, 26, 1, 28, 2, 31, 1, 38, 1, 40, 2])
    x, y = data.reshape(-1, 2).T

    def tri_norm(x, *args):
        # Sum of three scaled normal densities: means, widths, then scale factors.
        m1, m2, m3, s1, s2, s3, k1, k2, k3 = args
        ret = k1 * scipy.stats.norm.pdf(x, loc=m1, scale=s1)
        ret += k2 * scipy.stats.norm.pdf(x, loc=m2, scale=s2)
        ret += k3 * scipy.stats.norm.pdf(x, loc=m3, scale=s3)
        return ret

    # Initial guesses in the order m1, m2, m3, s1, s2, s3, k1, k2, k3.
    params = [-50, 3, 20, 1, 1, 1, 1, 1, 1]

    fitted_params, _ = scipy.optimize.curve_fit(tri_norm, x, y, p0=params)

    # Plot the raw points and the fitted three-component curve.
    plt.plot(x, y, 'o')
    xx = np.linspace(np.min(x), np.max(x), 1000)
    plt.plot(xx, tri_norm(xx, *fitted_params))
    plt.show()


    >>> fitted_params
    array([ -60.46845528,    3.801281  ,   13.66342073,   28.26485602,
              1.63256981,   10.31905367,  110.51392765,   69.11867159,
             63.2545624 ])

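So you can see that the idea of a three-peaked function doesn't agree very well with your real data: the fitted means come out at about -60.5, 3.8 and 13.7, with widths of about 28.3, 1.6 and 10.3, so only the central component is a sharp peak and the other two are broad bumps.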
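A possible variation, sketched here as an assumption and not part of the original post, is to give curve_fit() starting values closer to the scale of the data and to constrain each component with the `bounds` argument so that it stays near its expected peak. The function name `tri_norm_bounded`, the starting values and the bounds below are illustrative guesses based on the peak positions described in the question; whether the result describes the data better still has to be checked on a plot.

```python
import numpy as np
import scipy.stats
from scipy.optimize import curve_fit

# Dataset from the question, stored as flat (dist, eff) pairs.
data = np.array([-69, 3, -68, 2, -67, 1, -66, 1, -60, 1, -59, 1,
                 -58, 1, -57, 2, -56, 1, -55, 1, -54, 1, -52, 1,
                 -50, 2, -48, 3, -47, 1, -46, 3, -45, 1, -43, 1,
                 0, 1, 1, 2, 2, 12, 3, 18, 4, 18, 5, 13, 6, 9,
                 7, 7, 8, 5, 9, 3, 10, 1, 13, 2, 14, 3, 15, 2,
                 16, 2, 17, 2, 18, 2, 19, 2, 20, 2, 21, 3, 22, 1,
                 24, 1, 25, 1, 26, 1, 28, 2, 31, 1, 38, 1, 40, 2],
                dtype=float)
x, y = data.reshape(-1, 2).T


def tri_norm_bounded(x, m1, m2, m3, s1, s2, s3, k1, k2, k3):
    """Sum of three scaled normal densities (same model as tri_norm)."""
    return (k1 * scipy.stats.norm.pdf(x, loc=m1, scale=s1)
            + k2 * scipy.stats.norm.pdf(x, loc=m2, scale=s2)
            + k3 * scipy.stats.norm.pdf(x, loc=m3, scale=s3))


# Assumed starting values: means near the described peaks, guessed widths,
# and scale factors of roughly peak height * width * sqrt(2 * pi).
p0 = [-55, 4, 20, 8, 2, 8, 40, 90, 40]

# Assumed bounds keeping each mean in its own region and widths positive.
lower = [-70, 0, 10, 0.5, 0.5, 0.5, 0, 0, 0]
upper = [-40, 10, 40, 30, 30, 30, 500, 500, 500]

fitted, _ = curve_fit(tri_norm_bounded, x, y, p0=p0,
                      bounds=(lower, upper), max_nfev=5000)


def sse(params):
    """Sum of squared residuals of the model for a parameter vector."""
    return float(np.sum((y - tri_norm_bounded(x, *params)) ** 2))


print(fitted)
print(sse(p0), sse(fitted))
```

Passing `bounds` makes curve_fit() switch from its default Levenberg-Marquardt solver to a bounded least-squares method, so the fitted means cannot drift away from their regions the way they did in the unconstrained fit.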

