• Python example: resolving short URLs to long URLs


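    Building a GUI was more trouble than it was worth, so this script works on files instead. It needs two files:

    1. readUrl.txt holds the text containing the URLs to resolve

    2. newUrl.txt receives the resolved URLs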

    Contents

    Sample readUrl.txt file

    Code example

    Recommended regex for extracting web page URLs

    Resolved output: newUrl.txt


    Sample readUrl.txt file

    #接龙
    
    
    http://t.csdn.cn/DWodz
    
    1. CSDN-亮点 http://t.csdn.cn/DWodz
    2. 不知名白帽  http://t.csdn.cn/YO6Sm
    3. 编程爱好者-阿新 http://t.csdn.cn/4suuN
    4. 一一哥 https://yiyige.blog.csdn.net/article/details/120990448
    5. 执久 http://t.csdn.cn/4UCQf
    6. 花神庙码农@CSDN
    http://t.csdn.cn/t9moE
    7. 木木 http://t.csdn.cn/aalnU
    8. 挽·烽 http://t.csdn.cn/LaZIz
    高质量三连回访
    9. 六月暴雪飞梨花 http://t.csdn.cn/VqL0s
    10. 风铃听雨~ http://t.csdn.cn/9fkAT
    11. 东非不开森 http://t.csdn.cn/fZa8s 开学季征文 如有时间愿意看的,可以指点一下嘿嘿 谢谢啦🥰🥰
    12. 小明java问道之路 经验文 | 编程的上帝视角是什么?感兴趣的可以看看
    http://t.csdn.cn/ffDdq
    
    硬核深度文 | 精通内核-CPU控制并发原理CPU中断控制
    http://t.csdn.cn/UZ6kN
    
    💖在线求个一键三连💖
    13. AKA|布鲁克林欧神仙 https://blog.csdn.net/m0_54594153/article/details/126661839?spm=1001.2014.3001.5501高质量三连回访
    14. 阿提说说 http://t.csdn.cn/K3KSU
    15. DDD666🍭 http://t.csdn.cn/2zn4R
    16. 付文龙(爱吃回锅肉)红目香薰 http://t.csdn.cn/kqcPv
    17. Bourne http://t.csdn.cn/ndJvc
    18. 秦羽 http://t.csdn.cn/nn0cO
    19. 宁采桃花不采臣 http://t.csdn.cn/nqgEK
    2.​Code For Better
    20. CSDN-北极的三哈
    http://t.csdn.cn/Zn1WF
    21. promise https://blog.csdn.net/m0_71485750/article/details/126427221  互三互粉
    22. Beyond https://blog.csdn.net/chuxinchangcun/article/details/126681915

    Code example:

    import re

    import requests

    # Read the raw text that contains the URLs to resolve.
    with open("readUrl.txt", "r", encoding="utf-8") as file:
        strList = file.read()

    headers = {
        "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36"
    }
    # Match http/https URLs; a match ends at the first character outside these sets.
    rep = r"http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+"
    listUrl = re.findall(rep, strList)

    # Remove duplicates while keeping the original order.
    list_not_dup = list()
    for i in listUrl:
        if i not in list_not_dup:
            list_not_dup.append(i)
    for item in list_not_dup:
        print(item)

    strUrl = ""
    for item in list_not_dup:
        # requests follows redirects by default, so .url is the final long URL.
        # headers must be passed by keyword: the second positional argument is params.
        html = requests.get(item, headers=headers).url
        # Drop the query string (for example ?spm=...) from the resolved URL.
        result = html.split("?")
        strUrl += result[0] + "\n"

    # Write one resolved URL per line.
    with open("newUrl.txt", "w", encoding="utf-8") as file:
        file.write(strUrl)
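
The script above stops at the first link that fails to load. As a minimal sketch of one way to keep going, the version below collects failures instead of raising. It swaps in the standard library's `urllib` for `requests` so that it has no third-party dependency; the names `fetch_final_url`, `strip_query`, `resolve_all`, `TIMEOUT_SECONDS` and `USER_AGENT` are hypothetical and not part of the original script.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import Request, urlopen

# Assumed values for illustration; tune them for your own run.
TIMEOUT_SECONDS = 10
USER_AGENT = "Mozilla/5.0"


def fetch_final_url(url):
    """Follow redirects with the standard library and return the final URL."""
    request = Request(url, headers={"User-Agent": USER_AGENT})
    with urlopen(request, timeout=TIMEOUT_SECONDS) as response:
        return response.geturl()


def strip_query(url):
    """Drop the query string and fragment, keeping scheme, host and path."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def resolve_all(urls, fetch=fetch_final_url):
    """Resolve each URL; return (resolved, failed) instead of raising."""
    resolved, failed = [], []
    for url in urls:
        try:
            resolved.append(strip_query(fetch(url)))
        except (OSError, ValueError) as error:
            # URLError and socket timeouts are OSError subclasses;
            # ValueError covers malformed URLs.
            failed.append((url, str(error)))
    return resolved, failed
```

Passing the fetch function as a parameter is a design choice for this sketch: it lets the loop be exercised with a stand-in function, without touching the network.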

    Recommended regex for extracting web page URLs

    "http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*,]|(?:%[0-9a-fA-F][0-9a-fA-F]))+" 
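
To see what this pattern picks up, here is a small sketch that runs it over a few lines in the style of the sample file. The helper name `extract_urls` and the `sample` text are illustrative assumptions; only the pattern itself comes from the article.

```python
import re

URL_PATTERN = re.compile(
    r"http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*,]|(?:%[0-9a-fA-F][0-9a-fA-F]))+"
)


def extract_urls(text):
    """Return the URLs found in text, in first-seen order without duplicates."""
    # dict.fromkeys keeps insertion order, so it doubles as an ordered set.
    return list(dict.fromkeys(URL_PATTERN.findall(text)))


sample = (
    "1. CSDN-亮点 http://t.csdn.cn/DWodz\n"
    "6. 花神庙码农@CSDN\n"
    "http://t.csdn.cn/t9moE\n"
    "13. https://blog.csdn.net/m0_54594153/article/details/126661839?spm=1001.2014.3001.5501高质量三连回访\n"
    "1. CSDN-亮点 http://t.csdn.cn/DWodz\n"
)

for url in extract_urls(sample):
    print(url)
```

Because the pattern only accepts ASCII characters, a match ends cleanly where Chinese text follows a URL without a space. One caveat: the range `$-_` inside the character class also covers `)`, `,` and `.`, so punctuation written directly after a URL in English prose will be captured as part of the match.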

    Resolved output: newUrl.txt

    https://blog.csdn.net/CSDN_anhl/article/details/126240868
    https://blog.csdn.net/m0_63127854/article/details/126682845
    https://blog.csdn.net/m0_47419053/article/details/126679490
    https://yiyige.blog.csdn.net/article/details/120990448
    https://blog.csdn.net/weixin_60719453/article/details/126674166
    https://blog.csdn.net/qxhgd/article/details/115391385
    https://blog.csdn.net/m0_64102491/article/details/126673956
    https://blog.csdn.net/Fire_Cloud_1/article/details/126669683
    https://blog.csdn.net/L_Lycos/article/details/126614374
    https://blog.csdn.net/muzi_longren/article/details/126654597
    https://blog.csdn.net/m0_62159662/article/details/126653214
    https://blog.csdn.net/FMC_WBL/article/details/126683043
    https://blog.csdn.net/FMC_WBL/article/details/126575914
    https://blog.csdn.net/m0_54594153/article/details/126661839
    https://blog.csdn.net/weixin_40972073/article/details/126682094
    https://blog.csdn.net/BIT_666/article/details/126656554
    https://blog.csdn.net/feng8403000/article/details/126674232
    https://blog.csdn.net/qq_44631587/article/details/126667516
    https://blog.csdn.net/qq_43585922/article/details/126685211
    https://blog.csdn.net/m0_65909361/article/details/126599073
    https://blog.csdn.net/m0_68744965/article/details/126471630
    https://blog.csdn.net/m0_71485750/article/details/126427221
    https://blog.csdn.net/chuxinchangcun/article/details/126681915
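
Every line in the output above points at a full article URL rather than the short-link host. As a rough sketch, a check like the following could flag any line that did not get resolved; `still_short` and `SHORT_HOST` are hypothetical names, and the host value is taken from the sample input.

```python
from urllib.parse import urlsplit

SHORT_HOST = "t.csdn.cn"  # short-link host seen in the sample input


def still_short(lines, short_host=SHORT_HOST):
    """Return the non-empty lines whose host is still the short-link host."""
    leftovers = []
    for line in lines:
        url = line.strip()
        if url and urlsplit(url).hostname == short_host:
            leftovers.append(url)
    return leftovers
```

It can be applied to the result file with `still_short(open("newUrl.txt", encoding="utf-8"))`; an empty list means no short links remain.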
  • Original article: https://blog.csdn.net/feng8403000/article/details/126687760