Use Tumblr crawler and h5ai to create private video and picture libraries

Tumblr is rich in content, especially pictures and videos. This guide uses a Tumblr crawler together with h5ai, a directory listing program (other image hosting programs would also work), to build a private gallery.
CentOS 7 ships with Python 2.7, so the setup below uses CentOS 7 and Python 2.7 with a LAMP stack, and h5ai to present the gallery.

1. Install and use h5ai
1. Install the LAMP stack with the lnmp one-click installation package

 yum -y install wget screen
 wget http://mirrors.linuxeye.com/lnmp-full.tar.gz
 tar xzf lnmp-full.tar.gz
 cd lnmp
 screen -S lnmp
 ./install.sh

During installation, select Apache, PHP, and MySQL; the other components are optional.

2. Create a site

 ./vhost.sh

Follow the prompts. In this example, the site created is t.sib8.net.
For a detailed tutorial, see: OneinStack: lnmp, lamp, lnmpa one-click installation package (supports HHVM)

3. Install h5ai
Enter the t.sib8.net directory

 cd /data/wwwroot/t.sib8.net/
 wget https://down.zhujiwiki.com/code/h5ai-0.29-mod.zip
 unzip h5ai-0.29-mod.zip

4. Modify the configuration file

 vi /usr/local/apache/conf/vhost/t.sib8.net.conf

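Change

 DirectoryIndex index.html index.php

to

 DirectoryIndex index.html index.php /_h5ai/server/php/index.php

Note: official h5ai releases since 0.27 place the entry point at /_h5ai/public/index.php. If the unpacked _h5ai folder has no server/php directory, use that path instead.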

Restart Apache

 service httpd restart
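
The DirectoryIndex edit in step 4 can also be scripted. The following is a minimal Python sketch, not part of h5ai or OneinStack: the function name add_h5ai_index is hypothetical, and it assumes the vhost file contains a single-line DirectoryIndex directive like the one shown. It only transforms text; writing the result back to the vhost file and restarting Apache is left to you.

```python
import re

# Entry point taken from step 4; adjust it if your h5ai copy differs.
H5AI_INDEX = "/_h5ai/server/php/index.php"


def add_h5ai_index(conf_text, entry=H5AI_INDEX):
    """Append `entry` to every DirectoryIndex line that lacks it."""
    def patch(match):
        line = match.group(0)
        if entry in line.split():
            return line  # already present, leave the line untouched
        return line.rstrip() + " " + entry

    # (?m) makes ^ and $ match at the start and end of each line.
    return re.sub(r"(?m)^[ \t]*DirectoryIndex\b.*$", patch, conf_text)
```

Running the function twice on the same text gives the same result, so it is safe to re-apply after the vhost file has been regenerated.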

5. Install ffmpeg for video previews
First install the EPEL repository:

 yum -y install epel-release

Then add the RPM Fusion and Nux Dextop repositories:

 su -c 'yum localinstall --nogpgcheck https://download1.rpmfusion.org/free/el/rpmfusion-free-release-7.noarch.rpm https://download1.rpmfusion.org/nonfree/el/rpmfusion-nonfree-release-7.noarch.rpm'
 rpm --import http://li.nux.ro/download/nux/RPM-GPG-KEY-nux.ro
 rpm -Uvh http://li.nux.ro/download/nux/dextop/el7/x86_64/nux-dextop-release-0-1.el7.nux.noarch.rpm

Install ffmpeg:

 yum -y install ffmpeg ffmpeg-devel
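
Before relying on video previews, it may be worth confirming that ffmpeg is actually runnable. A minimal sketch in Python; the helper name tool_available is made up for illustration, and it assumes only that `ffmpeg -version` exits successfully when the tool is installed.

```python
import subprocess


def tool_available(name):
    """Return True if `name -version` can be run and exits with status 0."""
    try:
        subprocess.check_output([name, "-version"], stderr=subprocess.STDOUT)
        return True
    except (OSError, subprocess.CalledProcessError):
        # OSError: binary missing or not executable.
        # CalledProcessError: the command ran but returned a non-zero status.
        return False


print("ffmpeg available: %s" % tool_available("ffmpeg"))
```

If this reports False after the yum step, check that the repositories above were added without errors.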

2. Use tumblr-crawler
1. Install dependencies that may be required

 yum install openssl-devel bzip2-devel expat-devel gdbm-devel readline-devel sqlite-devel
 yum -y install gcc automake autoconf libtool make
 yum install gcc gcc-c++
 yum -y install readline-devel

2. Install tumblr-crawler

 cd /data/wwwroot/t.sib8.net/
 git clone https://github.com/dixudx/tumblr-crawler.git
 cd tumblr-crawler
 pip install -r requirements.txt

3. Use tumblr-crawler to download pictures and videos
a. List the Tumblr blogs in sites.txt, separated by commas and without the .tumblr.com suffix. For example, for wanimal1983.tumblr.com and ma-tro.tumblr.com:
wanimal1983,ma-tro
After saving, run

 python tumblr-photo-video-ripper.py

b. Or pass the blog names directly on the command line

 python tumblr-photo-video-ripper.py wanimal1983,ma-tro
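
If you start from full blog addresses, the comma-separated form used in sites.txt and on the command line can be derived mechanically. A minimal sketch; blog_name and sites_line are hypothetical helpers, not part of tumblr-crawler, and they assume the blogs are hosted under .tumblr.com.

```python
SUFFIX = ".tumblr.com"


def blog_name(address):
    """Reduce 'https://name.tumblr.com/' or 'name.tumblr.com' to 'name'."""
    host = address.strip()
    if "://" in host:
        host = host.split("://", 1)[1]  # drop the scheme
    host = host.split("/", 1)[0].lower()  # drop any path
    if host.endswith(SUFFIX):
        host = host[:-len(SUFFIX)]
    return host


def sites_line(addresses):
    """Join blog names with commas, skipping blanks and duplicates."""
    names = []
    for address in addresses:
        name = blog_name(address)
        if name and name not in names:
            names.append(name)
    return ",".join(names)


print(sites_line(["wanimal1983.tumblr.com", "https://ma-tro.tumblr.com/"]))
# prints: wanimal1983,ma-tro
```

The resulting string can be saved as the single line of sites.txt or passed as the argument shown in option b.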

4. Pictures and videos are saved under the current path, in a folder named after each Tumblr blog

5. Combine with tumblr_spider to collect more users and video addresses

3. Use scheduled tasks to automatically update videos and pictures

 crontab -e

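Add the following line:

 0 */6 * * * cd /data/wwwroot/t.sib8.net/tumblr-crawler && python tumblr-photo-video-ripper.py

This runs the crawler every 6 hours. The cd matters because downloads are saved relative to the current path, and cron does not start jobs in the crawler directory.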
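
To see which hours the `*/6` field in `0 */6 * * *` selects, the step rule can be expanded by hand. A minimal sketch; expand_step is a hypothetical helper that mimics only the `*/n` form of a cron field, not the full crontab syntax.

```python
def expand_step(step, low=0, high=23):
    """Values matched by '*/step' in a cron field covering low..high."""
    if step <= 0:
        raise ValueError("step must be positive")
    return list(range(low, high + 1, step))


print(expand_step(6))  # hour field of "0 */6 * * *"
# prints: [0, 6, 12, 18]
```

With the minute field fixed at 0, the job therefore starts at 00:00, 06:00, 12:00 and 18:00.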
