Use Tumblr crawler and h5ai to create private video and picture libraries

Tumblr hosts a large amount of content, especially pictures and videos. This guide uses a Tumblr crawler together with h5ai, a directory indexer (other image-hosting programs would also work), to build a private gallery.
CentOS 7 ships with Python 2.7. The steps below use CentOS 7 and Python 2.7 with a LAMP stack, and h5ai serves the gallery.

1. Install and use h5ai
1. Install the lamp one-click installation package

 yum -y install wget screen
 wget http://mirrors.linuxeye.com/lnmp-full.tar.gz
 tar xzf lnmp-full.tar.gz
 cd lnmp
 screen -S lnmp
 ./install.sh

During installation, select Apache, PHP, and MySQL; the remaining components are optional.

2. Create a site

 ./vhost.sh

Follow the prompts. In this example, the site created is t.sib8.net.
For a detailed tutorial, see: OneinStack: lnmp, lamp, lnmpa one-click installation package (supports HHVM)

3. Install h5ai
Enter the t.sib8.net directory

 cd /data/wwwroot/t.sib8.net/
 wget https://down.zhujiwiki.com/code/h5ai-0.29-mod.zip
 unzip h5ai-0.29-mod.zip

4. Modify the configuration file

 vi /usr/local/apache/conf/vhost/t.sib8.net.conf

Find the line

 DirectoryIndex index.html index.php

Change to

 DirectoryIndex index.html index.php /_h5ai/server/php/index.php
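The same edit could also be scripted instead of made by hand in vi. The following is only a sketch: `add_h5ai_index` is a hypothetical helper name, and it assumes the vhost file contains a single-line `DirectoryIndex` directive as shown above. It keeps a one-time backup and skips lines that already mention `_h5ai`, so running it twice does not append the entry twice.

```shell
#!/bin/sh
# add_h5ai_index FILE: append the h5ai index script to DirectoryIndex lines.
# Hypothetical helper; assumes DirectoryIndex sits on one line in FILE.
add_h5ai_index() {
    conf="$1"
    # Keep the first backup only, so the original file is never overwritten.
    [ -e "$conf.bak" ] || cp "$conf" "$conf.bak"
    # Skip lines that already contain _h5ai; extend the others.
    sed '/_h5ai/!s|^\([[:space:]]*DirectoryIndex[[:space:]].*\)$|\1 /_h5ai/server/php/index.php|' \
        "$conf" > "$conf.tmp" && mv "$conf.tmp" "$conf"
}

# Usage, with the path from this step:
# add_h5ai_index /usr/local/apache/conf/vhost/t.sib8.net.conf
```

If the result looks wrong, the untouched original is still available as the `.bak` file next to the configuration.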

Restart Apache

 service httpd restart

5. Install ffmpeg to enable video previews. First, install the EPEL repository:

 yum -y install epel-release

Install the RPM Fusion and Nux Dextop repositories:

 su -c 'yum localinstall --nogpgcheck https://download1.rpmfusion.org/free/el/rpmfusion-free-release-7.noarch.rpm https://download1.rpmfusion.org/nonfree/el/rpmfusion-nonfree-release-7.noarch.rpm'
 rpm --import http://li.nux.ro/download/nux/RPM-GPG-KEY-nux.ro
 rpm -Uvh http://li.nux.ro/download/nux/dextop/el7/x86_64/nux-dextop-release-0-1.el7.nux.noarch.rpm

Install ffmpeg:

 yum -y install ffmpeg ffmpeg-devel
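Before relying on video previews, it may be worth confirming that the ffmpeg binary actually ended up on the PATH. A minimal sketch follows; `have_tool` is a hypothetical helper name, not part of h5ai or ffmpeg.

```shell
#!/bin/sh
# have_tool NAME: report whether NAME can be found on the PATH.
have_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

have_tool ffmpeg
```

If the check reports the tool as missing, the repository steps above are the first thing to re-check.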

2. Use tumblr-crawler
1. Install dependencies that may be needed

 yum install openssl-devel bzip2-devel expat-devel gdbm-devel readline-devel sqlite-devel
 yum -y install gcc automake autoconf libtool make
 yum install gcc gcc-c++
 yum -y install readline-devel

2. Install tumblr-crawler

 cd /data/wwwroot/t.sib8.net/
 git clone https://github.com/dixudx/tumblr-crawler.git
 cd tumblr-crawler
 pip install -r requirements.txt

3. Use tumblr-crawler to download pictures and videos
a. Add Tumblr sites to sites.txt by blog name only (for wanimal1983.tumblr.com or ma-tro.tumblr.com, that is wanimal1983 or ma-tro), separated by commas. For example:
 wanimal1983,cncn88
After saving, run

 python tumblr-photo-video-ripper.py

b. Direct download

 python tumblr-photo-video-ripper.py wanimal1983,ma-tro
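For option a, blog names often start out as full URLs copied from a browser. The sketch below reduces them to the comma-separated form shown above; `blog_name` and `join_names` are hypothetical helper names, and the only assumption about the crawler is the comma-separated format already used in this step.

```shell
#!/bin/sh
# blog_name URL: reduce a blog URL or hostname to the bare blog name.
blog_name() {
    name="${1#http://}"          # drop a leading http://
    name="${name#https://}"      # drop a leading https://
    name="${name%%/*}"           # drop any path after the hostname
    echo "${name%.tumblr.com}"   # drop the .tumblr.com suffix
}

# join_names URL...: print the blog names joined with commas.
join_names() {
    out=""
    for url in "$@"; do
        out="${out:+$out,}$(blog_name "$url")"
    done
    echo "$out"
}

# Example:
# join_names wanimal1983.tumblr.com https://ma-tro.tumblr.com/ > sites.txt
```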

4. All pictures and videos are saved under the current path, in a folder named after the Tumblr blog

5. Optionally, combine it with tumblr_spider to collect multiple users and video addresses

3. Use scheduled tasks to automatically update videos and pictures

 crontab -e

Add the following line:

 0 */6 * * * python /data/wwwroot/t.sib8.net/tumblr-crawler/tumblr-photo-video-ripper.py

This runs the crawler every 6 hours.
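Two things are worth guarding against in a scheduled run. Cron does not start in the site directory, and since downloads are saved under the current path, the files may not land where h5ai serves them. A slow crawl could also still be running when the next one starts. The sketch below addresses both with a hypothetical `run_once` helper and a simple lock directory; the lock path is an assumption, and a stale lock left behind by a killed run would have to be removed by hand.

```shell
#!/bin/sh
# run_once DIR LOCKDIR CMD...: run CMD from DIR unless LOCKDIR already exists.
# Hypothetical helper; mkdir is used as the lock because it either creates
# the directory or fails, with nothing in between.
run_once() {
    dir="$1"; lock="$2"; shift 2
    if ! mkdir "$lock" 2>/dev/null; then
        echo "previous run still active, skipping" >&2
        return 1
    fi
    status=0
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$dir" && "$@" ) || status=$?
    rmdir "$lock"
    return "$status"
}

# In a wrapper script called from cron, the call might look like:
# run_once /data/wwwroot/t.sib8.net /tmp/tumblr-crawler.lock \
#     python tumblr-crawler/tumblr-photo-video-ripper.py
```

The crontab entry would then point at the wrapper script instead of calling python directly.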
