<!-- markdown css tag --><div class="pinggu_markdown">
<div class="pinggu_markdown__html"><h1 id="第一章:用python绘制柱状图,对比商品销售额的差异">第一章:用Python绘制柱状图,对比商品销售额的差异</h1>
<p>本文分享知识:</p>
<blockquote>
<ol>
<li>os 模块获取上一级目录的绝对地址</li>
<li>pands 读取 sqlite3 数据库中的数据</li>
<li>将 pandas 中显示为科学计数法的数据修改为正常显示</li>
<li>pandas 对一列数据求最大值、最小值、求和操作</li>
<li>pyecharts 绘制柱状图</li>
</ol>
</blockquote>
<p><strong>关注微信公众号《帅帅的Python》,后台回复《数据分析》获取数据及源码</strong><br>
<img src="https://gitee.com/myrensheng/data_analysis/raw/master/images/sspython.png" alt="帅帅的Python"></p>
<h2 id="项目背景">项目背景</h2>
<p>“小凡,上个月的商品销售数据你简单分析一下,等会发我?” 经理走到小凡身边问到</p>
<p>“好的,我下午发您”,小凡一边说着,一边打开 jupyter 开始操作</p>
<h2 id="读取数据">读取数据</h2>
<p>小凡常用的数据分析工具是 pandas、numpy ,连接数据库常用的工具是 sqlalchemy</p>
<pre class=" language-python"><code class="prism language-python"><span class="token keyword">import</span> pandas <span class="token keyword">as</span> pd
<span class="token keyword">import</span> numpy <span class="token keyword">as</span> np
<span class="token keyword">from</span> sqlalchemy <span class="token keyword">import</span> create_engine
<span class="token keyword">import</span> os
</code></pre>
<p>数据放在上一级的目录下名为 data.db 的文件</p>
<pre class=" language-python"><code class="prism language-python"><span class="token comment"># 数据库地址:数据库放在上一级目录下</span>
db_path <span class="token operator">=</span> os<span class="token punctuation">.</span>path<span class="token punctuation">.</span>join<span class="token punctuation">(</span>os<span class="token punctuation">.</span>path<span class="token punctuation">.</span>dirname<span class="token punctuation">(</span>os<span class="token punctuation">.</span>getcwd<span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span><span class="token string">"data.db"</span><span class="token punctuation">)</span>
engine_path <span class="token operator">=</span> <span class="token string">"sqlite:///"</span><span class="token operator">+</span>db_path
<span class="token comment"># 获取数据函数,根据输入的SQL语句返回 Datafr ame 类型数据</span>
<span class="token keyword">def</span> <span class="token function">link_sqlite</span><span class="token punctuation">(</span>sql<span class="token punctuation">)</span><span class="token punctuation">:</span>
engine <span class="token operator">=</span> create_engine<span class="token punctuation">(</span>engine_path<span class="token punctuation">)</span>
df <span class="token operator">=</span> pd<span class="token punctuation">.</span>read_sql<span class="token punctuation">(</span>sql<span class="token punctuation">,</span>con<span class="token operator">=</span>engine<span class="token punctuation">)</span>
<span class="token keyword">return</span> df
sql <span class="token operator">=</span> <span class="token string">"select * from salesSummary"</span>
df <span class="token operator">=</span> link_sqlite<span class="token punctuation">(</span>sql<span class="token punctuation">)</span>
</code></pre>
<h2 id="数据描述">数据描述</h2>
<pre class=" language-python"><code class="prism language-python">df<span class="token punctuation">.</span>shape
<span class="token comment"># (30, 3)</span>
</code></pre>
<p>数据维度为 30 行 3 列,我们随机查看5条数据:</p>
<pre class=" language-python"><code class="prism language-python">df<span class="token punctuation">.</span>sample<span class="token punctuation">(</span><span class="token number">5</span><span class="token punctuation">)</span>
</code></pre>
<p><img src="./%E5%9B%BE%E7%89%87/1.png" alt=""></p>
<p>该数据描述的是30个商品上个月在淘宝、京东的销售额(元)情况,简单看一下数据的分布情况:</p>
<pre class=" language-python"><code class="prism language-python"><span class="token comment"># 数据分布</span>
df<span class="token punctuation">.</span>describe<span class="token punctuation">(</span><span class="token punctuation">)</span>
</code></pre>
<p><img src="./%E5%9B%BE%E7%89%87/2.png" alt=""></p>
<p>该方法返回的是数据的数量、均值、最大值、最小值、分位数信息,但是 pandas 用科学记数法显示的数据,无法直观看到数据,通过下面方法可以设置修改 pandas 的默认显示方式</p>
<pre class=" language-python"><code class="prism language-python">np<span class="token punctuation">.</span>set_printoptions<span class="token punctuation">(</span>suppress<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">)</span>
pd<span class="token punctuation">.</span>set_option<span class="token punctuation">(</span><span class="token string">"display.float_format"</span><span class="token punctuation">,</span><span class="token keyword">lambda</span> x<span class="token punctuation">:</span><span class="token string">"%.f"</span><span class="token operator">%</span>x<span class="token punctuation">)</span>
df<span class="token punctuation">.</span>describe<span class="token punctuation">(</span><span class="token punctuation">)</span>
</code></pre>
<p><img src="./%E5%9B%BE%E7%89%87/3.png" alt=""></p>
<p>统计一下每个商品的总销售额、每个店铺的最大、最小销售额数据</p>
<pre class=" language-python"><code class="prism language-python"><span class="token comment"># 添加一列总销售额</span>
df<span class="token punctuation">[</span><span class="token string">"总销售额"</span><span class="token punctuation">]</span> <span class="token operator">=</span> df<span class="token punctuation">[</span><span class="token string">"淘宝"</span><span class="token punctuation">]</span><span class="token operator">+</span>df<span class="token punctuation">[</span><span class="token string">"京东"</span><span class="token punctuation">]</span>
<span class="token comment"># 淘宝店总销售额</span>
df<span class="token punctuation">[</span><span class="token string">"淘宝"</span><span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token builtin">sum</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
<span class="token comment"># 671058957</span>
<span class="token comment"># 京东店总销售额</span>
df<span class="token punctuation">[</span><span class="token string">"京东"</span><span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token builtin">sum</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
<span class="token comment"># 639936097</span>
</code></pre>
<pre class=" language-python"><code class="prism language-python"><span class="token comment"># 销售额最高的商品</span>
df<span class="token punctuation">[</span>df<span class="token punctuation">[</span><span class="token string">"总销售额"</span><span class="token punctuation">]</span><span class="token operator">==</span>df<span class="token punctuation">[</span><span class="token string">"总销售额"</span><span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token builtin">max</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">]</span>
</code></pre>
<p><img src="./%E5%9B%BE%E7%89%87/4.png" alt=""></p>
<pre class=" language-python"><code class="prism language-python"><span class="token comment"># 销售额最低的商品</span>
df<span class="token punctuation">[</span>df<span class="token punctuation">[</span><span class="token string">"总销售额"</span><span class="token punctuation">]</span><span class="token operator">==</span>df<span class="token punctuation">[</span><span class="token string">"总销售额"</span><span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token builtin">min</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">]</span>
</code></pre>
<p><img src="./%E5%9B%BE%E7%89%87/5.png" alt=""></p>
<h2 id="柱状图对比">柱状图对比</h2>
<p>小凡常用的绘图工具为 pyecharts ,用下面代码绘制出商品在淘宝、京东的销售柱状图</p>
<pre class=" language-python"><code class="prism language-python"><span class="token keyword">from</span> pyecharts <span class="token keyword">import</span> options <span class="token keyword">as</span> opts
<span class="token keyword">from</span> pyecharts<span class="token punctuation">.</span>charts <span class="token keyword">import</span> Bar
x_names <span class="token operator">=</span> df<span class="token punctuation">[</span><span class="token string">"商品"</span><span class="token punctuation">]</span><span class="token punctuation">.</span>tolist<span class="token punctuation">(</span><span class="token punctuation">)</span>
tao_bao <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token builtin">round</span><span class="token punctuation">(</span>x<span class="token operator">/</span><span class="token number">10000</span><span class="token punctuation">,</span><span class="token number">2</span><span class="token punctuation">)</span> <span class="token keyword">for</span> x <span class="token keyword">in</span> df<span class="token punctuation">[</span><span class="token string">"淘宝"</span><span class="token punctuation">]</span><span class="token punctuation">.</span>tolist<span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">]</span>
jing_dong <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token builtin">round</span><span class="token punctuation">(</span>x<span class="token operator">/</span><span class="token number">10000</span><span class="token punctuation">,</span><span class="token number">2</span><span class="token punctuation">)</span> <span class="token keyword">for</span> x <span class="token keyword">in</span> df<span class="token punctuation">[</span><span class="token string">"京东"</span><span class="token punctuation">]</span><span class="token punctuation">.</span>tolist<span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">]</span>
c <span class="token operator">=</span> <span class="token punctuation">(</span>
Bar<span class="token punctuation">(</span><span class="token punctuation">)</span>
<span class="token punctuation">.</span>add_xaxis<span class="token punctuation">(</span>x_names<span class="token punctuation">)</span>
<span class="token punctuation">.</span>add_yaxis<span class="token punctuation">(</span><span class="token string">"淘宝"</span><span class="token punctuation">,</span> tao_bao<span class="token punctuation">)</span>
<span class="token punctuation">.</span>add_yaxis<span class="token punctuation">(</span><span class="token string">"京东"</span><span class="token punctuation">,</span> jing_dong<span class="token punctuation">)</span>
<span class="token punctuation">.</span>set_global_opts<span class="token punctuation">(</span>
xaxis_opts<span class="token operator">=</span>opts<span class="token punctuation">.</span>AxisOpts<span class="token punctuation">(</span>axislabel_opts<span class="token operator">=</span>opts<span class="token punctuation">.</span>LabelOpts<span class="token punctuation">(</span>rotate<span class="token operator">=</span><span class="token operator">-</span><span class="token number">15</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
title_opts<span class="token operator">=</span>opts<span class="token punctuation">.</span>TitleOpts<span class="token punctuation">(</span>title<span class="token operator">=</span><span class="token string">"商品销售额分析"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
<span class="token punctuation">)</span>
<span class="token punctuation">.</span>set_series_opts<span class="token punctuation">(</span>
label_opts<span class="token operator">=</span>opts<span class="token punctuation">.</span>LabelOpts<span class="token punctuation">(</span>is_show<span class="token operator">=</span><span class="token boolean">False</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
markpoint_opts<span class="token operator">=</span>opts<span class="token punctuation">.</span>MarkPointOpts<span class="token punctuation">(</span>
data<span class="token operator">=</span><span class="token punctuation">[</span>
opts<span class="token punctuation">.</span>MarkPointItem<span class="token punctuation">(</span>type_<span class="token operator">=</span><span class="token string">"max"</span><span class="token punctuation">,</span> name<span class="token operator">=</span><span class="token string">"最大值"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
opts<span class="token punctuation">.</span>MarkPointItem<span class="token punctuation">(</span>type_<span class="token operator">=</span><span class="token string">"min"</span><span class="token punctuation">,</span> name<span class="token operator">=</span><span class="token string">"最小值"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
<span class="token punctuation">]</span>
<span class="token punctuation">)</span><span class="token punctuation">,</span>
markline_opts<span class="token operator">=</span>opts<span class="token punctuation">.</span>MarkLineOpts<span class="token punctuation">(</span>
data<span class="token operator">=</span><span class="token punctuation">[</span>
opts<span class="token punctuation">.</span>MarkLineItem<span class="token punctuation">(</span>type_<span class="token operator">=</span><span class="token string">"average"</span><span class="token punctuation">,</span> name<span class="token operator">=</span><span class="token string">"平均值"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
<span class="token punctuation">]</span>
<span class="token punctuation">)</span><span class="token punctuation">,</span>
<span class="token punctuation">)</span>
<span class="token punctuation">)</span>
c<span class="token punctuation">.</span>render_notebook<span class="token punctuation">(</span><span class="token punctuation">)</span>
</code></pre>
<p><img src="./%E5%9B%BE%E7%89%87/6.png" alt=""></p>
<h2 id="分析报告">分析报告</h2>
<p>上个月销售的30件商品中:</p>
<blockquote>
<p>淘宝店总销售额为:6亿7千万;<br>
京东店总销售额为:6以4千万;<br>
最高的销售额商品为:商品19,总销售额为1亿;<br>
最低的销售额商品为:商品26,总销售额为3百万;</p>
</blockquote>
<p>小凡将数据分析的结果整理好制作成 word 文档发给经理。</p>
<p>过了一会儿,经理找到小凡,让小凡帮忙预测一下,下个月商品的销售额是多少?给出一个大概的区间即可。</p>
</div>
</div>