python - Plotting flaws in Matplotlib
A scatter plot made from x,y coordinates in matplotlib differs from the plots obtained with other programs. For example, here are the results of a PCA on 2 fit scores. The same graph made with R, from the same data, gives a different display. I also checked with Excel and LibreOffice: both give the same display as R. Before roaring against matplotlib or reporting a bug, I would like other opinions, to check whether I did things correctly. What are my flaws?
I checked that floats are not the problem, and that the coordinates are in the same order. The plot with R:
mydata = read.csv("c:/users/anon/desktop/data.txt")  # read csv file
summary(mydata)
attach(mydata)
plot(mydata)
Scatter plot made with R
The same data plotted with matplotlib:
import matplotlib.pyplot as mpl
import numpy as np
import os

# open the file with the PCA results and convert the values to float
file_data = os.getcwd() + "\\data.txt"
f = open(file_data, 'r')
data = f.readlines()
f.close()
for x in range(len(data)):
    a = data[x]
    b = a.split(',')
    data[x] = b
for i in xrange(len(data)):
    for j in xrange(len(data[i])):
        data[i][j] = float(data[i][j])
print data[0]
x_train = np.mat(data)
print "x_train\n", x_train
mpl.scatter(x_train[:, 0], x_train[:, 1], c='white')
mpl.show()
And the results of printing x_train (so you can verify that the data are the same) and of Excel:
Data: (I cannot post all of the data; please tell me how to attach a *.txt file of ~40.5 KB)
0.02753547770433 -0.037999362802379
0.05179194064903 0.0257492713593311
-0.0272928319004863 0.0065143681863637
0.0891355504379135 -0.00801696955147688
0.0946809371499167 -0.00502202338807476
-0.0445799941736001 -0.0435759273767196
-0.333617999778119 -0.204222004815357
-0.127212025425053 -0.110264460064754
-0.0243459270896855 -0.0622273166478512
0.0497080821876597 0.0272080474151131
-0.181221703468915 -0.134945934382777
-0.0699503258694739 -0.0835239795690277
Edit: I have already exported the PCA data (from scipy) to a text file, and opened that common text file with both Python/matplotlib and R, to avoid problems related to the PCA itself. The plots are made after this handling (and the graph before the PCA looks like a dome).
Edit 2: Using numpy.loadtxt(), the plot displays like the one from R. But the custom method and numpy.loadtxt() provide the same data shape, size, type and values, so what is the mechanism involved?
x_train from numpy.loadtxt()
[[ 0.02753548 -0.03799936]
 [ 0.05179194  0.02574927]
 [-0.02729283  0.00651437]
 ...,
 [ 0.02670961 -0.00696177]
 [ 0.09011859 -0.00661216]
 [-0.04406559  0.09285291]]
shape and size
(1039L, 2L) 2078
x_train from the custom method
[[ 0.02753548 -0.03799936]
 [ 0.05179194  0.02574927]
 [-0.02729283  0.00651437]
 ...,
 [ 0.02670961 -0.00696177]
 [ 0.09011859 -0.00661216]
 [-0.04406559  0.09285291]]
shape and size
(1039L, 2L) 2078
The problem is that you are representing x_train as a matrix rather than as a 2-dimensional array. This means that when you subset it with x_train[:, 0], you aren't getting a 1-dimensional array; you are getting a matrix with 1 column (which matplotlib then tries to scatter). You can see this by printing x_train[:, 0].*
You can fix the problem by changing the line:
x_train = np.mat(data)
to
x_train = np.array(data)
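This is also consistent with the second edit in the question: numpy.loadtxt() returns a plain ndarray, not a matrix, so its column slices are already 1-dimensional. A minimal sketch, assuming the file is comma-separated (as the split(',') in the question suggests); the in-memory `sample` below is a hypothetical stand-in for data.txt and holds only the first three posted rows:

```python
import io
import numpy as np

# Hypothetical stand-in for data.txt (assumed comma-separated),
# using the first three coordinate pairs posted in the question.
sample = io.StringIO(
    "0.02753547770433,-0.037999362802379\n"
    "0.05179194064903,0.0257492713593311\n"
    "-0.0272928319004863,0.0065143681863637\n"
)

# loadtxt parses the floats and returns a regular 2-D ndarray.
x_train = np.loadtxt(sample, delimiter=",")

# Slicing a column of an ndarray drops that axis, giving 1-D data
# of the kind a scatter call expects for its x and y arguments.
x = x_train[:, 0]
y = x_train[:, 1]

print(type(x_train).__name__, x_train.shape, x.shape, y.shape)
```

With a real file, the StringIO object would be replaced by the file path.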
*For example, on the data you posted, x_train[:, 0] is:
[[ 0.02753548]
 [ 0.05179194]
 [-0.02729283]
 [ 0.08913555]
 [ 0.09468094]
 [-0.04457999]
 [-0.333618  ]
 [-0.12721203]
 [-0.02434593]
 [ 0.04970808]
 [-0.1812217 ]
 [-0.06995033]]
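The difference in shape can be checked directly. The sketch below is illustrative only: it uses three of the posted rows, and it calls np.asmatrix rather than np.mat (np.mat was an alias for it in older NumPy releases and is no longer available in NumPy 2.0). The variable names are my own.

```python
import numpy as np

# First three coordinate pairs from the posted data.
data = [
    [0.02753547770433, -0.037999362802379],
    [0.05179194064903, 0.0257492713593311],
    [-0.0272928319004863, 0.0065143681863637],
]

as_matrix = np.asmatrix(data)  # matrix class, as np.mat(data) produced
as_array = np.array(data)      # plain 2-D ndarray

# A matrix always stays 2-D, even when a single column is selected.
col_matrix = as_matrix[:, 0]
# An ndarray drops the indexed axis and yields a 1-D array.
col_array = as_array[:, 0]

print(col_matrix.shape)  # (3, 1)
print(col_array.shape)   # (3,)

# If a matrix has to be kept, its column can be flattened before plotting.
flat = np.asarray(col_matrix).ravel()
print(flat.shape)        # (3,)
```

Either converting the whole dataset with np.array or flattening each column in this way hands 1-dimensional coordinates to the scatter call.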