#pip install modules
wordcloud
matplotlib  # data visualization
jieba  # Chinese word segmentation library
pillow
numpy
Possible problems:
Installing wordcloud may require Visual C++ 14.0; if the build fails, install wordcloud from a prebuilt .whl file instead.
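Before running the examples, it can help to confirm that every package in the list is importable. The following is a minimal sketch using only the standard library; the names `REQUIRED` and `missing_packages` are made up for this illustration. Note that pillow is imported under the name `PIL`.

```python
from importlib.util import find_spec

# Map each pip package name to the name used in "import ..." statements.
# (Hypothetical helper data for this sketch; pillow is imported as "PIL".)
REQUIRED = {
    "wordcloud": "wordcloud",
    "matplotlib": "matplotlib",
    "jieba": "jieba",
    "pillow": "PIL",
    "numpy": "numpy",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose import name cannot be found."""
    return [
        pip_name
        for pip_name, import_name in required.items()
        if find_spec(import_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages, try: pip install " + " ".join(missing))
    else:
        print("All required packages are installed.")
```

If the script reports wordcloud as missing even after pip ran, the build step probably failed, which is the Visual C++ problem described above.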
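WordCloud() parameters:
Parameter | Meaning |
---|---|
font_path | Path to the font file to use; both OTF and TTF are supported |
width | Width of the word cloud, default 400 |
height | Height of the word cloud, default 200 |
mask | Mask image used to customize the shape of the word cloud; when a mask is given, width and height are ignored |
min_font_size | Smallest font size, default 4 |
max_font_size | Largest font size, defaults to the height of the word cloud |
max_words | Maximum number of words, default 200 |
stopwords | Stopwords to ignore; if not given, the built-in stopword list is used |
background_color | Background color, default black |
mode | Defaults to RGB; with RGBA and background_color set to None, the background is transparent |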
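To make `stopwords` and `max_words` more concrete, here is a simplified, standard-library-only sketch of the kind of filtering these two parameters imply. It is not wordcloud's actual implementation (the real library does more, such as its own tokenization); the function name `top_words` and the tiny stopword set are assumptions made for this illustration.

```python
from collections import Counter


def top_words(text, stopwords, max_words=200):
    """Count space-separated words, skip stopwords, keep the most frequent ones."""
    words = [w.lower() for w in text.split()]
    counts = Counter(w for w in words if w not in stopwords)
    # most_common(n) returns at most n (word, count) pairs, highest count first.
    return counts.most_common(max_words)


sample = "love and knowledge and pity and love"
print(top_words(sample, stopwords={"and"}, max_words=2))
```

Here "and" is dropped as a stopword, and only the two most frequent remaining words survive the `max_words` cut.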
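1. Pick one of the fonts that ship with your computer and copy it next to the script, or point font_path at its location:
C:\Windows\Fonts
2. Download any other font you like and put it in the path you specify.
Note: some fonts are provided here and can be downloaded as needed.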
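The two options above amount to "look next to the script first, then in a known font folder". A minimal sketch of that lookup is shown below; `find_font` is a hypothetical helper written for this illustration, and the search list is only an example.

```python
from pathlib import Path


def find_font(filename, search_dirs):
    """Return the first existing path to `filename` in `search_dirs`, else None."""
    for directory in search_dirs:
        candidate = Path(directory) / filename
        if candidate.is_file():
            return str(candidate)
    return None


# Look next to the script first, then in the Windows font folder.
font_path = find_font("msyh.ttc", [".", r"C:\Windows\Fonts"])
print(font_path or "Font not found; pass an explicit font_path instead.")
```

The returned string can be passed as `font_path` to `WordCloud(...)` in the examples that follow.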
Chinese:
与你交流,就像与自己的灵魂在耳语,自由,让心跳的声音呼之欲出,真想伸出双臂与你进行一次心贴心的拥抱窗外寂静无声。唯独冬雨拍打台阶的声音犹如琴音围炉,已经是梦想中的奢望甚至无须。有金骏眉陪,足矣浪漫隐藏在烟圈之中,以一个思想者的姿态静坐,思绪感觉已经和你水乳交融。唇齿生香不仅仅来源于杯中的香茗仿佛是等待一场期待已久的约会琐碎的日子也被润泽得云蒸霞蔚实际上,你读着我的书心迷神醉你的灵魂已经变成奔向我的异乡人我是终究要牵你手的。这也许就是江湖。温馨,让忧伤与我无缘我也不会让疼痛,在你心里暗藏一场雪已经送来彼此的关切
从洼得不能再洼的地方起步。像一个坐标轴上的负数,更像一块会走动的棱角分明的石头。生活是刀,削铁如泥它切割着我,我磨砺着它。终于有一天我走到了坐标轴上的零点。
我还是没有圆滑。而生活也没有更柔软。我把目光,投向看不见的远方。我把背影,留给回不去的故乡。铿锵的步履犹如骏马的蹄声一度成为故乡的传奇。
而正道,从来就不是一条直线。九曲十八弯的路上。我的泪水可以被路边一朵小花引出。泰山压顶却改变不了我平视的眼神。妥协深藏我的脊梁怎么也弯不下来。这就是我半生的经历。我已经没有余力去改变。夕阳晚照里唯一的预见就是回归。回归成那个洼得不能再洼的地方的一块石头这些,你不知道。你不知道,我不怪你。
English:
Three passions, simple but overwhelmingly strong, have governed my life: the longing for love, the search for knowledge,
and unbearable pity for the suffering of mankind. These passions, like great winds, have blown me hither and thither, in a
wayward course, over a deep ocean of anguish, reaching to the verge of despair. I have sought love, first, because it brings
ecstasy --- ecstasy so great that I would have sacrificed all the rest of life for a few hours of this joy. I have sought it, next,
because it relieves loneliness --- that terrible loneliness in which one shivering consciousness looks over the rim of the world
into cold unfathomable lifeless abyss. I have sought it, finally, because in the union of love I have seen, in a mystic miniature,
the prefiguring vision of the heaven that saints and poets have imagined. This is what I sought, and though it might seem
too good for human life, this is what --- at last --- I have found. With equal passion I have sought knowledge. I have wished
to understand the hearts of men, I have wished to know why the stars shine. And I have tried to apprehend the Pythagorean
power by which number holds sway above the flux. A little of this, but not much, I have achieved. Love and knowledge, so
far as they were possible, led upward toward the heavens. But always pity brought me back to earth. Echoes of cries of pain
reverberated in my heart. Children in famine, victims tortured by oppressors, helpless old people a hated burden to their sons,
and the whole world of loneliness, poverty, and pain make a mockery of what human life should be. I long to alleviate the evil,
but I cannot, and I too suffer. This has been my life. I have found it worth living, and I would gladly live it again if the chance
were offered to me.
The result looks like this:
The code is as follows:
##chinese
from wordcloud import WordCloud
import matplotlib.pyplot as plt
# Open the text file
text = open("bulletchinese.txt", encoding="utf-8").read()
# Create the word cloud object
wc = WordCloud(font_path="msyh.ttc", width=800, height=600, mode="RGBA", background_color=None).generate(text)
# Display the word cloud
plt.imshow(wc, interpolation='bilinear')
plt.axis("off")
plt.show()
# Save to a file
wc.to_file("bulletchinese.png")
The result looks like this:
The code is as follows:
from wordcloud import WordCloud
import matplotlib.pyplot as plt
# Open the text file
text = open("bulletEnglish.txt", encoding="utf-8").read()
# print(text)
# print(type(text)) # <class 'str'>
# Create the word cloud object
#wc = WordCloud().generate(text)
wc = WordCloud(font_path="msyh.ttc", width=800, height=600, mode="RGBA", background_color=None).generate(text)
# Display the word cloud
plt.imshow(wc, interpolation='bilinear') # interpolation selects how pixels are interpolated when the image is displayed
plt.axis("off") # turn off the axes
plt.show()
# Save the word cloud image to a file
wc.to_file("bulletEnglish.png")
The result looks like this:
The code is as follows:
from wordcloud import WordCloud
import matplotlib.pyplot as plt
import jieba
# Open the text file
text = open("bulletchinese.txt", encoding="utf-8").read()
# Chinese word segmentation
text = ' '.join(jieba.cut(text)) # jieba.cut yields the segmented words; join them with spaces into one long string
print(text[:100]) # print the first 100 characters
# Create the word cloud object
wc = WordCloud(font_path="msyh.ttc", width=800, height=600, mode="RGBA", background_color=None).generate(text)
# Display the word cloud
plt.imshow(wc, interpolation="bilinear")
plt.axis("off")
plt.show()
# Save to a file
wc.to_file("bulletchinese2.png")
The result looks like this:
The code is as follows:
from wordcloud import WordCloud
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
import jieba
# Open the text file
text = open("bulletchinese.txt", encoding="utf-8").read()
# Chinese word segmentation
text = ' '.join(jieba.cut(text))
print(text[:1000])
# Create the word cloud object
mask = np.array(Image.open("background.png")) # load the mask image as an array
wc = WordCloud(mask=mask, font_path="msyh.ttc", width=800, height=600, mode="RGBA", background_color=None).generate(text)
# Display the word cloud
plt.imshow(wc, interpolation="bilinear")
plt.axis("off")
plt.show()
# Save to a file
wc.to_file("bulletchinese3.png")
The result looks like this:
The code is as follows:
from wordcloud import WordCloud, ImageColorGenerator
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
import jieba
# Open the text file
text = open("bulletchinese.txt", encoding="utf-8").read()
# Chinese word segmentation
text = ' '.join(jieba.cut(text))
print(text[:1000])
# Create the word cloud object
mask = np.array(Image.open("background1.png"))
wc = WordCloud(mask=mask, font_path="msyh.ttc", width=1000, height=800, mode="RGBA", background_color=None).generate(text)
# Take the word colors from the image
image_colors = ImageColorGenerator(mask)
wc.recolor(color_func=image_colors)
# Display the word cloud
plt.imshow(wc, interpolation="bilinear")
plt.axis("off")
plt.show()
# Save to a file
wc.to_file("bulletchinese4.png")
The result looks like this:
The code is as follows:
from wordcloud import WordCloud
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
import random
import jieba
# Open the text file
text = open("bulletchinese.txt", encoding="utf-8").read()
# Chinese word segmentation
text = ' '.join(jieba.cut(text))
print(text[:1000])
# Color function: (word, font size, position, orientation, font path, random state)
def random_color(word, font_size, position, orientation, font_path, random_state):
    # hsl stands for hue, saturation, and lightness
    s = 'hsl(0, %d%%, %d%%)' % (random.randint(60, 80), random.randint(60, 80))
    print(s)
    return s
# Create the word cloud object
mask = np.array(Image.open("background.png"))
# color_func=random_color means the function decides the color of each word
wc = WordCloud(color_func=random_color, mask=mask, font_path="msyh.ttc", mode="RGBA", background_color=None).generate(text)
# Display the word cloud
plt.imshow(wc, interpolation="bilinear")
plt.axis("off")
plt.show()
# Save to a file
wc.to_file("bulletchinese5.png")
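The color function above draws from the global `random` module and ignores its `random_state` argument. A variation that uses the argument is sketched below, on the assumption that wordcloud passes a `random.Random` instance (or None) as `random_state`; the name `red_tone_color` is made up for this example. The function can be tried on its own, without building a word cloud.

```python
import random


def red_tone_color(word=None, font_size=None, position=None,
                   orientation=None, font_path=None, random_state=None):
    """Return a red hue with saturation and lightness drawn from 60-80%."""
    # Fall back to a fresh generator when no random_state is supplied.
    rng = random_state if random_state is not None else random.Random()
    saturation = rng.randint(60, 80)
    lightness = rng.randint(60, 80)
    return "hsl(0, %d%%, %d%%)" % (saturation, lightness)


# The same seed gives the same color, which helps make images reproducible.
print(red_tone_color("demo", random_state=random.Random(42)))
```

It would be passed the same way as before, for example `WordCloud(color_func=red_tone_color, ...)`.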
The result looks like this:
The code is as follows:
from wordcloud import WordCloud, ImageColorGenerator
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
import jieba.analyse # used to extract keywords
# Open the text file
text = open("bulletchinese.txt", encoding="utf-8").read()
# Extract keywords and their weights
freq = jieba.analyse.extract_tags(text, topK=200, withWeight=True) # topK is the number of keywords to extract; withWeight=True also returns each keyword's weight
print(freq[:200])
freq = {i[0]: i[1] for i in freq} # convert the (keyword, weight) pairs to a dict
# Create the word cloud object
mask = np.array(Image.open("background2.png"))
wc = WordCloud(mask=mask, font_path="msyh.ttc", mode="RGBA", background_color=None).generate_from_frequencies(freq)
# Take the word colors from the image
image_colors = ImageColorGenerator(mask)
wc.recolor(color_func=image_colors)
# Display the word cloud
plt.imshow(wc, interpolation="bilinear")
plt.axis("off")
plt.show()
# Save to a file
wc.to_file("bulletchinese6.png")
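The step that turns the (keyword, weight) pairs into a dict can be tried in isolation. The sketch below uses made-up weights in place of real `extract_tags` output, and `to_frequency_dict` is a hypothetical helper name. The scaling step is optional and assumes positive weights; as far as I know, only the relative sizes of the weights matter to the word cloud.

```python
def to_frequency_dict(pairs):
    """Turn (word, weight) pairs into a dict, scaling so the largest weight is 1.0."""
    freq = dict(pairs)  # same result as {i[0]: i[1] for i in pairs}
    if not freq:
        return {}
    top = max(freq.values())
    return {word: weight / top for word, weight in freq.items()}


# Made-up weights standing in for extract_tags(..., withWeight=True) output.
pairs = [("故乡", 0.5), ("石头", 0.25), ("生活", 0.125)]
print(to_frequency_dict(pairs))
# → {'故乡': 1.0, '石头': 0.5, '生活': 0.25}
```

The resulting dict has the shape that `generate_from_frequencies` is given in the code above.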
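Sharing:
Solitude sometimes holds humiliation and endurance, and it also breeds anger and resistance. The silence of such moments often hides great explosive force. The loneliest moments in life often become the catalyst of a lifetime.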