• A crawler written in Haskell that uses an HTTP proxy IP


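    Below is a simple example of a crawler written in Haskell. It sends its requests through an HTTP proxy IP to crawl Baidu Images. Note that this is only a basic example; a real crawler would need to handle more details, such as error handling and data cleaning.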


    {-# LANGUAGE OverloadedStrings #-}
    -- Requires the http-client and http-types packages.
    import Network.HTTP.Client
    import Network.HTTP.Types.Status (statusCode)
    import Text.Read (readMaybe)
    import qualified Data.ByteString.Char8 as BS
    import qualified Data.ByteString.Lazy as LBS
    
    main :: IO ()
    main = do
      -- Proxy settings: the host is fixed, the port is read from standard input
      putStrLn "Enter the proxy port:"
      input <- getLine
      case readMaybe input of
        Nothing      -> putStrLn "Invalid port: expected a number"
        Just portNum -> crawl (Proxy "www.duoip.cn" portNum)
    
    -- Fetch the start URL through the given proxy and print a short summary
    crawl :: Proxy -> IO ()
    crawl proxyCfg = do
      -- Every request made with this manager goes through the proxy
      mgr <- newManager (managerSetProxy (useProxy proxyCfg) defaultManagerSettings)
    
      -- Start URL ("图片" means "images"); parseRequest escapes the
      -- characters that are not allowed in a URI
      let startUrl = "http://www.baidu.com/s?wd=图片" :: String
      initReq <- parseRequest startUrl
    
      -- Browser-like request headers. The Host header is filled in by
      -- http-client. The cookie values are the placeholders from the
      -- original listing; the rest of the original cookie string is
      -- garbled and has been left out.
      let req = initReq
            { requestHeaders =
                [ ("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3")
                , ("Referer", BS.pack (show (getUri initReq)))
                , ("Upgrade-Insecure-Requests", "1")
                , ("Cookie", "BDUSS=12345678901234567890123456789012; BIDUPSID=12345678901234567890123456789012")
                ]
            }
    
      -- The original listing breaks off inside the Cookie header, so the
      -- remaining steps are an assumed minimal completion: send the request
      -- and report the status code and body size.
      response <- httpLbs req mgr
      putStrLn ("Status: " ++ show (statusCode (responseStatus response)))
      putStrLn ("Body length: " ++ show (LBS.length (responseBody response)) ++ " bytes")
    
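The program above takes two inputs that are easy to get wrong: the proxy port typed by the user, and a search URL whose query contains non-ASCII text. The following is a minimal sketch of how both could be checked or prepared with only the `base` library. The helper names `parsePort`, `utf8Bytes`, and `percentEncode` are hypothetical and do not come from the original article. The sketch assumes that a valid port lies in the range 1 to 65535 and that query text is percent-encoded as UTF-8, leaving only the RFC 3986 unreserved characters unescaped.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (isAlphaNum, isAscii, isDigit, ord, toUpper)
import Numeric (showHex)

-- | Hypothetical helper: accept only a decimal port number in 1..65535.
parsePort :: String -> Maybe Int
parsePort s
  | not (null s) && all isDigit s && length s <= 5 && n >= 1 && n <= 65535 = Just n
  | otherwise = Nothing
  where
    -- Only evaluated once the digit and length checks above have passed
    n = read s :: Int

-- | UTF-8 encode a single character into its byte values.
utf8Bytes :: Char -> [Int]
utf8Bytes c
  | n < 0x80    = [n]
  | n < 0x800   = [0xC0 .|. (n `shiftR` 6), cont n]
  | n < 0x10000 = [0xE0 .|. (n `shiftR` 12), cont (n `shiftR` 6), cont n]
  | otherwise   = [0xF0 .|. (n `shiftR` 18), cont (n `shiftR` 12), cont (n `shiftR` 6), cont n]
  where
    n = ord c
    -- Continuation byte: 10xxxxxx carrying the low six bits of x
    cont x = 0x80 .|. (x .&. 0x3F)

-- | Hypothetical helper: percent-encode a query value as UTF-8,
-- keeping only the RFC 3986 unreserved characters as they are.
percentEncode :: String -> String
percentEncode = concatMap encodeChar
  where
    encodeChar c
      | isAscii c && (isAlphaNum c || c `elem` "-._~") = [c]
      | otherwise = concatMap hexByte (utf8Bytes c)
    hexByte b = '%' : map toUpper (pad (showHex b ""))
    pad [d] = ['0', d]
    pad ds  = ds

main :: IO ()
main = do
  print (parsePort "8080")   -- Just 8080
  print (parsePort "80a")    -- Nothing
  -- prints http://www.baidu.com/s?wd=%E5%9B%BE%E7%89%87
  putStrLn ("http://www.baidu.com/s?wd=" ++ percentEncode "图片")
```

Checking the port before building the proxy settings turns a typo into a clear message instead of a failed connection later on, and encoding the query up front makes the URL that goes over the wire explicit.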
  • Original article: https://blog.csdn.net/weixin_44617651/article/details/134372470