
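Scraping data with cookie-bearing POST requests in Python and PHP

Yesterday, while browsing my school's campus card service platform, I noticed that entering a campus, a dormitory building, and a room number is all it takes to look up a room's electricity balance. There is no additional identity check.

Time to hit F12 and take a look at the request.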

General

Request URL : http://ecard.****.edu.cn/AutoPay/PowerFee/GetPowerBalance

Request Method : POST

Status Code : 200 OK

Remote Address : ***.***.***.***:80

Referrer Policy : no-referrer-when-downgrade


Form Data

payTypeCode : Sims
xiaoqu : 3
buildno : 1
roomno : 0218102
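
As a quick sanity check on what the browser actually sends, here is a minimal stdlib sketch that encodes the four form fields above the way a form post does. It assumes the body is `application/x-www-form-urlencoded`; the comments on what `xiaoqu` and `buildno` stand for are my reading of the field names, not something the site documents.

```python
from urllib.parse import urlencode

# Form fields exactly as shown in the DevTools "Form Data" panel.
form_data = {
    "payTypeCode": "Sims",
    "xiaoqu": "3",        # presumably the campus
    "buildno": "1",       # presumably the building index
    "roomno": "0218102",
}

# Form encoding joins key=value pairs with '&'.
body = urlencode(form_data)
print(body)
print(len(body.encode("ascii")))  # size of the request body in bytes
```

The body comes out at 50 bytes, which is the size a `Content-Length` header would report for this particular request.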


First, let's try forging the form with PHP and HTML

<?php 
function  spider($i) { 
?>
<form id = "test" method = "post" action = "http://ecard.****.edu.cn/AutoPay/PowerFee/GetPowerBalance">
<input name = "payTypeCode" value = "Sims" >
<input name = "xiaoqu" value = "3" >
<input name = "buildno" value = "1" >
<input name = "roomno" value = "<?php echo '0218'.$i;?>" >
<input type = "submit" value = "View" />
</form>
<script>
test.submit();
</script>
<?php    
} 

for($i=101;$i<=129;$i++){
	spider($i); 
}
?>

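Here we go...

Why is there only one record???

F12 shows that only a single request was made

It looks like the browser navigates to the result page as soon as the first form is submitted, so the remaining forms never get sent. Looping form submissions from PHP like this doesn't work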
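
Each lookup has to be its own HTTP request, and a script can issue those without any page navigation getting in the way. Below is a minimal stdlib sketch that prepares one POST per room for the same 101 to 129 range, without sending anything. The helper name `build_requests` is made up for illustration, and the URL is kept masked as above.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Masked exactly as in the post; substitute the real host before sending.
URL = "http://ecard.****.edu.cn/AutoPay/PowerFee/GetPowerBalance"


def build_requests(prefix="0218", first=101, last=129):
    """Return one prepared POST request per room; nothing is sent here."""
    prepared = []
    for room in range(first, last + 1):
        fields = {
            "payTypeCode": "Sims",
            "xiaoqu": "3",
            "buildno": "1",
            "roomno": prefix + str(room),
        }
        body = urlencode(fields).encode("ascii")
        prepared.append(Request(URL, data=body, method="POST"))
    return prepared


reqs = build_requests()
print(len(reqs))
print(reqs[0].data)
```

Each prepared request could then be passed to `urllib.request.urlopen`, one at a time. Whether the server also insists on a session cookie is something to check against the real site.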

 

I'm not getting any further with PHP, so it's time to switch to Python (lol)

# coding=utf-8
# 2018-09-08 v0.1
import requests

url = 'http://ecard.****.edu.cn/AutoPay/PowerFee/GetPowerBalance'

# Request headers copied from the browser. Content-Length is left out
# because requests computes it from the body.
headers = {
    'Host': 'ecard.****.edu.cn',
    'Connection': 'keep-alive',
    'Cache-Control': 'max-age=0',
    'Origin': 'http://ecard.****.edu.cn',
    'Upgrade-Insecure-Requests': '1',
    'Content-Type': 'application/x-www-form-urlencoded',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.81 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    'Referer': 'http://ecard.****.edu.cn/',
    'Accept-Encoding': 'gzip, deflate',
    'Accept-Language': 'zh-TW,zh;q=0.9,en-US;q=0.8,en;q=0.7',
    'Cookie': '********************************************************************'
}

# Loop over floors 1 to 6
for floor in range(1, 7):
    start = floor * 100 + 1
    end = floor * 100 + 30

    # Loop over the room numbers on this floor (x01 to x29)
    for num in range(start, end):
        # POST parameters
        params = {"payTypeCode": "Sims", "xiaoqu": "3", "buildno": "2", "roomno": "0219" + str(num)}

        resp = requests.post(url, headers=headers, data=params)
        print("Room " + str(num) + " balance: " + resp.text)


Here we go.

Room 101 balance: -0.01
Room 102 balance: 0
Room 103 balance: -0.01
Room 104 balance: 85.65
Room 105 balance: 88.12
Room 106 balance: 5.69
Room 107 balance: 53.56
Room 108 balance: 35.72
Room 109 balance: 0
Room 110 balance: 2.33
Room 111 balance: 5.97
Room 112 balance: -0.02
Room 113 balance: 49.12
Room 114 balance: 92.35
Room 115 balance: 42.18
Room 116 balance: 106.95
Room 117 balance: 33.3
Room 118 balance: -0.01
Room 119 balance: 0.55
Room 120 balance: 7.47
Room 121 balance: -0.01
Room 122 balance: 3.07
Room 123 balance: -0.01
Room 124 balance: -0.01
Room 125 balance: 0
Room 126 balance: 0
Room 127 balance: 0
Room 128 balance: 0
Room 129 balance: 0
Room 201 balance: 105.63
Room 202 balance: 0
Room 203 balance: 1.52
Room 204 balance: -0.01
Room 205 balance: -0.01
Room 206 balance: 27.35
Room 207 balance: 7.1
Room 208 balance: 11.16
Room 209 balance: -0.01
Room 210 balance: 0
Room 211 balance: 0
Room 212 balance: 38.82
Room 213 balance: 63.11
Room 214 balance: -1.63
Room 215 balance: 64.97
Room 216 balance: 19.31
Room 217 balance: 51.83
Room 218 balance: 7.31
Room 219 balance: 0
Room 220 balance: 4.85
Room 221 balance: 172.54
Room 222 balance: 19.98
Room 223 balance: 0.03
Room 224 balance: 16.6
Room 225 balance: 28.22
Room 226 balance: 0
Room 227 balance: 10.66
Room 228 balance: 38.97
Room 229 balance: This room number does not exist!
Room 301 balance: 23.94
Room 302 balance: 49.75
Room 303 balance: 321.89
Room 304 balance: 14.32
Room 305 balance: 45.88
Room 306 balance: 85.61
Room 307 balance: 22.17
Room 308 balance: 91.35
Room 309 balance: 23.73
Room 310 balance: 20.15
Room 311 balance: 15.98
Room 312 balance: 59.93
Room 313 balance: 11.17
Room 314 balance: 40.22
Room 315 balance: 18.51
Room 316 balance: 43.35
Room 317 balance: 26.8
Room 318 balance: 62.61
Room 319 balance: 73.4
Room 320 balance: 91.36
Room 321 balance: 36.72
Room 322 balance: 72.3
Room 323 balance: 6
Room 324 balance: 60.84
Room 325 balance: 44.51
Room 326 balance: 51.8
Room 327 balance: 53.18
Room 328 balance: -0.01
Room 329 balance: 0
Room 401 balance: 2.34
Room 402 balance: 106.78
Room 403 balance: 0
Room 404 balance: 106.54
Room 405 balance: 69.52
Room 406 balance: -0.01
Room 407 balance: 14.4
Room 408 balance: 0
Room 409 balance: 10.48
Room 410 balance: -0.01
Room 411 balance: 45.81
Room 412 balance: 105.71
Room 413 balance: 107.14
Room 414 balance: -0.01
Room 415 balance: 0
Room 416 balance: 15.62
Room 417 balance: -0.01
Room 418 balance: 20.31
Room 419 balance: 3.15
Room 420 balance: 0
Room 421 balance: -0.01
Room 422 balance: 0
Room 423 balance: 0
Room 424 balance: 31.82
Room 425 balance: 0
Room 426 balance: 67.34
Room 427 balance: 23.98
Room 428 balance: 32.4
Room 429 balance: 0
Room 501 balance: 164.51
Room 502 balance: 24.06
Room 503 balance: 50.87
Room 504 balance: 20.11
Room 505 balance: 46.83
Room 506 balance: 62.23
Room 507 balance: 13.67
Room 508 balance: 3.48
Room 509 balance: 0
Room 510 balance: 0
Room 511 balance: 27.43
Room 512 balance: 4.88
Room 513 balance: 0
Room 514 balance: 6.03
Room 515 balance: 72.01
Room 516 balance: 53.17
Room 517 balance: 30.62
Room 518 balance: 47.76
Room 519 balance: 56.42
Room 520 balance: 18.09
Room 521 balance: 65.31
Room 522 balance: 40.91
Room 523 balance: 72.77
Room 524 balance: 47.03
Room 525 balance: 123.24
Room 526 balance: 34.03
Room 527 balance: 30.16
Room 528 balance: -0.01
Room 529 balance: 0
Room 601 balance: 101.87
Room 602 balance: 5.96
Room 603 balance: 17.21
Room 604 balance: 21.65
Room 605 balance: 36.78
Room 606 balance: 13.89
Room 607 balance: 97.52
Room 608 balance: -0.02
Room 609 balance: 8.45
Room 610 balance: 19.69
Room 611 balance: 38.76
Room 612 balance: 63.32
Room 613 balance: 24.73
Room 614 balance: 45.22
Room 615 balance: 7.74
Room 616 balance: 29
Room 617 balance: 3.99
Room 618 balance: 32.1
Room 619 balance: 59.46
Room 620 balance: 66.05
Room 621 balance: 105.65
Room 622 balance: 45.99
Room 623 balance: 1.43
Room 624 balance: 37.8
Room 625 balance: -0.01
Room 626 balance: -0.01
Room 627 balance: -0.01
Room 628 balance: -0.01
Room 629 balance: 0
>>>

And just like that, the electricity balances for every room in building 219 have been scraped
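
The replies are plain text, and not every one is a number: room 229 comes back with an error message instead of a balance. If the results are going to be sorted or filtered, something like the sketch below could turn each reply into a float and set the non-numeric ones aside. The helper `parse_balance` is hypothetical, and the sample replies are three lines taken from the run above (the raw server text for room 229 is "该房间号不存在!", meaning "this room number does not exist").

```python
def parse_balance(text):
    """Return the balance as a float, or None when the reply is not a number."""
    try:
        return float(text.strip())
    except ValueError:
        return None


# Three replies taken from the run above, keyed by room number.
replies = {"104": "85.65", "214": "-1.63", "229": "该房间号不存在!"}

balances = {room: parse_balance(reply) for room, reply in replies.items()}

# Rooms whose balance is zero or negative; unknown rooms are skipped.
low = [room for room, value in balances.items() if value is not None and value <= 0]

print(balances)
print(low)
```

Returning `None` rather than raising keeps a single odd reply from aborting a whole building's worth of lookups.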

Next step: a GUI for checking the balance???


Let's take the "资讯快抓" (News Quick Grab) tool I wrote a while back and hack it up, haha

# -*- coding: utf-8 -*-
import requests
import wx


class Frame(wx.Frame):
    def __init__(self):
        wx.Frame.__init__(self, None, -1, 'Electricity Balance [219#305 only]', size=(400, 180))
        self.text = wx.TextCtrl(self, style=wx.TE_MULTILINE, size=(200, 100), pos=(10, 10), value="This is a real-time electricity balance lookup tool for 219#305 only. Click [Check Now] on the right to use it.")
        self.button = wx.Button(self, -1, "     Check Now     ", pos=(240, 20))
        self.button.Bind(wx.EVT_BUTTON, self.OnClick, self.button)

        self.button2 = wx.Button(self, -1, "       About         ", pos=(240, 70))
        self.button2.Bind(wx.EVT_BUTTON, self.OnClick2, self.button2)

    def OnClick(self, event):
        # HTTP request headers. Content-Length is left out because requests
        # computes it, and the two bodies below differ in length anyway.
        url = 'http://ecard.wxc.edu.cn/AutoPay/PowerFee/GetPowerBalance'
        headers = {
            'Host': 'ecard.wxc.edu.cn',
            'Connection': 'keep-alive',
            'Cache-Control': 'max-age=0',
            'Origin': 'http://ecard.****.edu.cn',
            'Upgrade-Insecure-Requests': '1',
            'Content-Type': 'application/x-www-form-urlencoded',
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.81 Safari/537.36',
            'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
            'Referer': 'http://ecard.****.edu.cn/',
            'Accept-Encoding': 'gzip, deflate',
            'Accept-Language': 'zh-TW,zh;q=0.9,en-US;q=0.8,en;q=0.7',
            'Cookie': '**************************************************'
        }
        # "0219305" queries the air-conditioning meter, "219305" the lighting meter
        params = {"payTypeCode": "Sims", "xiaoqu": "3", "buildno": "2", "roomno": "0219305"}
        params2 = {"payTypeCode": "Sims", "xiaoqu": "3", "buildno": "2", "roomno": "219305"}

        kt = requests.post(url, headers=headers, data=params)
        out = "Air-conditioning balance: " + str(kt.text) + "\n"
        zm = requests.post(url, headers=headers, data=params2)
        out += "Lighting balance: " + str(zm.text) + "\n"
        self.text.SetValue(out)

    def OnClick2(self, event):
        wx.MessageBox('Copyright © 2018 SENCOM. All Rights Reserved', 'About', wx.OK | wx.ICON_INFORMATION)


if __name__ == '__main__':
    app = wx.App()
    frame = Frame()
    frame.Show()
    app.MainLoop()
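
The GUI hard-codes one room. If it were ever extended to other rooms, the two payloads could be built by a small helper like the sketch below. The function `meter_payloads` is hypothetical, and it assumes the pattern seen for 219#305 (a leading "0" for the air-conditioning meter, none for the lighting meter) holds for other rooms too, which I have not verified.

```python
def meter_payloads(building, room, buildno="2", xiaoqu="3"):
    """Build the two POST payloads for one room.

    Assumption: as with 219#305, the air-conditioning meter is the
    lighting meter's room number with a leading "0".
    """
    base = {"payTypeCode": "Sims", "xiaoqu": xiaoqu, "buildno": buildno}
    lighting_id = building + room   # e.g. "219305"
    aircon_id = "0" + lighting_id   # e.g. "0219305"
    return {
        "air_conditioning": dict(base, roomno=aircon_id),
        "lighting": dict(base, roomno=lighting_id),
    }


payloads = meter_payloads("219", "305")
print(payloads["air_conditioning"])
print(payloads["lighting"])
```

With that in place, `OnClick` would only need to loop over the two entries and post each one.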
    

 

P.S. This only works from inside the school's campus network!