Ferry's blog

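Notes on using cookies to skip login
Updated 2025-02-18 | scrape
What cookies do: Cookies are stored on the client side, in the browser. Through them we send our identifying information to the server, where a session matching the cookies holds our personal data (login state, preferences, and so on). Every request we send to the server carries the cookies, and the server responds with our personal information, so we do not have to re-enter login details each time we visit the same domain. Cookies in scraping: In a scraper, every request must carry the cookies (I once left them off some requests and spent a long time unable to find what was wrong). The cookies can be found in the browser's developer tools; copy them, rewrite them as a dictionary, and pass them to the request. Raw cookies: wordpress_test_cookie=WP%20Cookie%20check;...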
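The "copy the cookies and rewrite them as a dictionary" step can be sketched as follows. This is a minimal illustration, not the post's own code: the helper name `cookie_header_to_dict` and the `session_id` cookie are hypothetical, and the string is assumed to be in the `name=value; name=value` form shown in the browser's developer tools.

```python
def cookie_header_to_dict(raw: str) -> dict[str, str]:
    """Turn a copied 'a=1; b=2' cookie string into {'a': '1', 'b': '2'}."""
    cookies = {}
    for pair in raw.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing ';'
        # Split on the first '=' only, since values may contain '=' themselves.
        name, _, value = pair.partition("=")
        cookies[name.strip()] = value.strip()
    return cookies


# Hypothetical cookie string, for illustration only.
raw = "wordpress_test_cookie=WP%20Cookie%20check; session_id=abc123"
cookies = cookie_header_to_dict(raw)
print(cookies)
```

With the `requests` library, a dictionary like this can be passed through the `cookies=` argument of `requests.get` or `requests.post`, and it has to be supplied on every request that needs the logged-in session.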
scrapy
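Updated 2025-06-30 | scrape
Overview: Below is a rough diagram of the Scrapy architecture; we will work through an example first and then introduce the parts one by one. Preparation: Make sure the Scrapy project is run from its root directory. The reason: a Scrapy project is usually a Python package, and when you run code from outside the project root, Python may fail to find the scrapytutorial package, so scrapytutorial.items cannot be resolved. This happens because Python's module search path (sys.path) does not include the project root. Create a Scrapy project: scrapy startproject <project_name>, for example scrapy startproject scrapytutorial. Create a spider: cd <project_name>, then scrapy genspider <spider_name> <domain>, for example cd scrapytutorial, then scrapy genspider quotes...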
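The sys.path issue described above can be made concrete with a small sketch. It assumes the project root is the directory containing `scrapy.cfg` (the file `scrapy startproject` generates); the helper names `find_project_root` and `ensure_on_sys_path` are hypothetical and are not part of Scrapy.

```python
import sys
from pathlib import Path
from typing import Optional


def find_project_root(start: Path, marker: str = "scrapy.cfg") -> Optional[Path]:
    """Walk upward from `start` until a directory containing `marker` is found."""
    start = start.resolve()
    for directory in (start, *start.parents):
        if (directory / marker).is_file():
            return directory
    return None


def ensure_on_sys_path(root: Path) -> None:
    """Put the project root first on sys.path so the project package can be imported."""
    root_str = str(root)
    if root_str not in sys.path:
        sys.path.insert(0, root_str)


root = find_project_root(Path.cwd())
if root is not None:
    ensure_on_sys_path(root)  # after this, `import scrapytutorial.items` can resolve
```

Running commands from the project root remains the simpler fix; a helper like this is only useful for scripts that have to be started from somewhere else.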
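Configuring a proxy in the terminal
Updated 2025-06-25 | Q&A
Which port should be used? Once a proxy client is running, which port should traffic be sent to? The conclusion first: in a browser, terminal, or Python code, use the local proxy 127.0.0.1:10808 (Mixed Port); when configuring a remote proxy inside Clash/V2Ray, use the remote address 123.45.67.89:443 (the proxy server's port). Why? Local software (browser, terminal, Python code, etc.) only needs to talk to the local proxy (Clash/V2Ray) and does not need to reach the remote proxy server directly. Clash/V2Ray opens a Mixed Port locally (such as 127.0.0.1:10808), which accepts HTTP/SOCKS5 proxy requests (from the browser, Python code, etc.), automatically selects the best remote proxy server, and handles encryption, traffic routing, and other complex logic. Configuring the proxy in the terminal: set http_proxy=http://127.0.0.1:10808 (the mixed port V2Ray exposes locally) set...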
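For the "Python code" case mentioned above, a standard-library sketch of pointing requests at the local Mixed Port might look like this. The address 127.0.0.1:10808 comes from the text; the helper names `build_proxies` and `make_proxy_opener` are hypothetical, and the port should be changed to whatever the local proxy client actually exposes.

```python
import urllib.request

# The local Mixed Port from the text; adjust to match your proxy client.
LOCAL_PROXY = "http://127.0.0.1:10808"


def build_proxies(proxy_url: str = LOCAL_PROXY) -> dict[str, str]:
    """Route both HTTP and HTTPS traffic through the same local proxy endpoint."""
    return {"http": proxy_url, "https": proxy_url}


def make_proxy_opener(proxy_url: str = LOCAL_PROXY) -> urllib.request.OpenerDirector:
    """Build an opener that sends its requests via the local proxy."""
    handler = urllib.request.ProxyHandler(build_proxies(proxy_url))
    return urllib.request.build_opener(handler)


opener = make_proxy_opener()
# Uncomment once the proxy client is running locally:
# opener.open("https://example.com", timeout=10)
```

The code only ever names the local address; choosing and talking to the remote server stays the job of Clash/V2Ray.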
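How JavaScript anti-scraping works
Updated 2025-02-15 | scrape
CSS and JS modify the browser's DOM: On pages we can view in a browser, the anti-scraping mechanism works by using CSS and JS to alter the tag content of the original HTML file (the data is placed in the JS), so we cannot get the data we want directly from the HTML file. It exploits a weakness of common scraping tools: they have no JS interpreter and no CSS engine. The DOM is the tag tree produced after the browser renders the page. CSS and JS cannot modify the tags in the HTML file itself, but they can modify the DOM.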
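A small sketch can show why a parser without a JS interpreter misses such data. The page below is hypothetical: the price is absent from the markup and written into the DOM by a script, so a static parse of the raw HTML finds an empty element. The class `TextById` and function `static_text` are illustrative names, and the parser assumes well-nested markup.

```python
from html.parser import HTMLParser

# Hypothetical page: the <span> is empty in the file; a script fills it in the browser.
RAW_HTML = """
<html><body>
  <span id="price"></span>
  <script>document.getElementById("price").textContent = "42.00";</script>
</body></html>
"""


class TextById(HTMLParser):
    """Collect the text inside the element with a given id, as found in the raw HTML."""

    def __init__(self, target_id: str):
        super().__init__()
        self.target_id = target_id
        self.depth = 0  # greater than 0 while inside the target element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def static_text(html: str, element_id: str) -> str:
    """Return the element's text without executing any script."""
    parser = TextById(element_id)
    parser.feed(html)
    return "".join(parser.chunks).strip()


# The script never ran, so the value only exists inside the JS source.
print(repr(static_text(RAW_HTML, "price")))
```

The value "42.00" is still present in the response, but only as part of the script text, which is why such pages are usually handled either by analysing the JS or by rendering the page in a real browser engine.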
python comprehensions
Updated 2023-03-29 | python
List Comprehension: List comprehension offers a shorter syntax when you want to create a new list based on the values of an existing list. The Syntax: newlist = [expression for item in iterable if condition == True] The return value is a new list, leaving the old list unchanged. Condition: The condition is like a filter that only accepts the items that evaluate to True. Iterable: The iterable can be any iterable object, like a list, tuple, set etc. Expression: The expression is the...
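The syntax above can be tried on a small list. The `fruits` data is made up for illustration; note that for a boolean condition, writing `if condition` is the usual spelling of `if condition == True`.

```python
fruits = ["apple", "banana", "cherry", "kiwi", "mango"]

# Condition as a filter: keep only the items containing the letter "a".
with_a = [fruit for fruit in fruits if "a" in fruit]

# Expression transforms each item; with no condition, every item is kept.
upper = [fruit.upper() for fruit in fruits]

# The iterable can be any iterable, for example a range.
squares_of_even = [n * n for n in range(6) if n % 2 == 0]

print(with_a)           # ['apple', 'banana', 'mango']
print(squares_of_even)  # [0, 4, 16]
print(fruits)           # the original list is unchanged
```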
Failed to connect to github.com port 443 after 21193 ms: Timed out
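Updated 2023-04-13 | git
This is not the first time I have run into this failure to connect to GitHub; it has troubled me for a long time. Fix: change the DNS server to 114.114.114.114 or another one.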
Tkinter
Updated 2023-04-13 | python
Introduction: The foundational element of a Tkinter GUI is the window. Windows are the containers in which all other GUI elements live. These other GUI elements, such as text boxes, labels, and buttons, are known as widgets. window: The first thing you need to do is import the Python GUI Tkinter module: import tkinter as tk A window is an instance of Tkinter’s Tk class. Go ahead and create a new window and assign it to the variable window: window = tk.Tk() widgets: Use the tk.Label class to...
Basic syntax
Updated 2023-05-18 | cpp
string: User Input Strings: cin treats whitespace (spaces, tabs, etc.) as a terminating character, which means that it reads only a single word (even if you type many words): string fullName; cout << "Type your full name: "; cin >> fullName; cout << "Your name is: " << fullName; // Type your full name: John Doe // Your name is: John That’s why, when working with strings, we often use the getline() function to read a line of text. It takes...
jQuery
Updated 2025-06-09 | JavaScript
jQuery Introduction: jQuery is a JavaScript Library. jQuery greatly simplifies JavaScript programming. jQuery also simplifies a lot of the complicated things from JavaScript, like AJAX calls and DOM manipulation. The jQuery library contains the following features: HTML/DOM manipulation, CSS manipulation, HTML event methods, effects and animations, AJAX, and utilities. Tip: In addition, jQuery has plugins for almost any task out there. jQuery Syntax: The jQuery syntax is tailor-made for selecting...
Ajax
Updated 2025-06-09 | JavaScript
AJAX Introduction: AJAX is a developer’s dream, because you can: read data from a web server after the page has loaded; update a web page without reloading the page; send data to a web server in the background. AJAX = Asynchronous JavaScript And XML. AJAX is not a programming language. AJAX just uses a combination of: a browser built-in XMLHttpRequest object (to request data from a web server); JavaScript and HTML DOM (to display or use the data). AJAX applications might use XML to...
FerryChan