Install Baidu Cloud Crawler on CentOS 7

The crawler depends on MySQL, Python 2.7, and MySQL-python. CentOS 7 ships with Python 2.7, so only MySQL and MySQL-python need to be installed first.

1. Install MySQL
Install Dependencies

 yum install libaio

Install MySQL

 wget http://dev.mysql.com/get/mysql-community-release-el7-5.noarch.rpm
 yum localinstall mysql-community-release-el7-5.noarch.rpm
 yum install mysql-community-server

Start MySQL

 systemctl start mysqld

Set MySQL password

 mysql_secure_installation

2. Firewall settings
Install iptables

 yum install iptables-services

Open port 3306

 vi /etc/sysconfig/iptables

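Add the following rules above any REJECT rule. If the file has no RH-Firewall-1-INPUT chain (the stock CentOS 7 rule set normally uses INPUT), replace the chain name accordingly.

 -A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 3306 -j ACCEPT
 -A RH-Firewall-1-INPUT -m state --state NEW -m udp -p udp --dport 3306 -j ACCEPT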

Restart iptables

 service iptables restart

3. Install MySQL-python

 yum install MySQL-python
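
To confirm that the package is visible to the interpreter before moving on, a small check like the one below can help. It is only a sketch: `check_module` is a hypothetical helper written for this illustration, and the only fact it relies on is that MySQL-python installs a module named `MySQLdb`.

```python
import sys


def check_module(name):
    """Return True if the named module can be imported, False otherwise."""
    try:
        __import__(name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # The crawler expects Python 2.7; print the running version for reference.
    print("Python %d.%d" % sys.version_info[:2])
    # MySQL-python provides the MySQLdb module.
    print("MySQLdb importable: %s" % check_module("MySQLdb"))
```

If the second line reports `False`, the package was probably installed for a different Python interpreter than the one being run.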

4. Set up the program

 wget https://github.com/x-spiders/baiduyun-spider/archive/master.zip
 unzip master.zip
 cd baiduyun-spider-master

Set the account and password for connecting to the database

Open bin/spider.py and modify DB_HOST, DB_PORT, DB_USER, and DB_PASS.
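
As a rough illustration, the four settings might look like the block below. The values are placeholders chosen for this example, not the ones shipped in bin/spider.py, and `port_is_open` is a hypothetical helper that only checks whether the MySQL port accepts TCP connections. It does not verify the user name or password.

```python
import socket

# Placeholder values; only the setting names come from bin/spider.py.
DB_HOST = "127.0.0.1"
DB_PORT = 3306
DB_USER = "root"
DB_PASS = "change-me"


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        conn = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        return False
    conn.close()
    return True


if __name__ == "__main__":
    state = "reachable" if port_is_open(DB_HOST, DB_PORT) else "not reachable"
    print("MySQL at %s:%d is %s" % (DB_HOST, DB_PORT, state))
```

If the port is reported as not reachable from another machine, the firewall rules and the `mysqld` service status are the first things to check.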

Run the crawler for the first time

 python bin/spider.py --seed-user

Run the crawler

 python bin/spider.py

Source code: https://geekspider.org/senior/215.html
