I have deployed a Scrapy project on an EC2 instance. I need Splash for scraping some websites. Splash works fine on my local machine, but on EC2 it returns a 504 timeout error. I have changed max_timeout to 3600, but the result is the same.
The EC2 instance has 2 GB of RAM.
Any help or suggestion is appreciated.
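For context, there are two separate timeouts involved: the Splash server's --max-timeout cap and the per-request timeout that scrapy-splash passes along. Below is a minimal sketch of where each one lives, assuming scrapy-splash is configured as usual; the spider name, URL, and values are placeholders, not my actual code:

```python
# Splash itself is usually started with a raised cap, e.g.:
#   docker run -p 8050:8050 scrapinghub/splash --max-timeout 3600
# The standard scrapy-splash settings (SPLASH_URL, middlewares, dupefilter)
# are assumed to be in settings.py already.

import scrapy
from scrapy_splash import SplashRequest


class ImageSpider(scrapy.Spider):
    # Hypothetical spider, shown only to illustrate where the timeout goes.
    name = "image_spider"

    def start_requests(self):
        yield SplashRequest(
            "https://example.com",              # placeholder URL
            callback=self.parse,
            args={"wait": 1, "timeout": 90},    # per-request timeout, capped by --max-timeout
        )

    def parse(self, response):
        # Pull image URLs out of the rendered HTML.
        yield {"image_urls": response.css("img::attr(src)").getall()}
```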
Yes. I also have some websites in the Scrapy project that do not need Splash, and those are working fine. - shovon 4 years, 8 months ago

https://github.com/scrapy-plugins/scrapy-splash/issues/28 Check this, and try lowering the concurrency or setting a timeout in the request parameters (see the sketch after these comments). - Vengat

I set the timeout to 90; same problem. And I need the images; I am only running Splash for the images. - shovon 4 years, 8 months ago

So what I understand is that even small sites stopped working after Splash was enabled, right? Can you share the code details so we can investigate further? - Vengat
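To make the concurrency suggestion concrete, here is a sketch of the throttling-related settings one might lower in settings.py; the values are illustrative, not a recommendation for every site:

```python
# settings.py (illustrative values)

# Fewer simultaneous requests means fewer concurrent render jobs hitting
# a single small Splash instance at once.
CONCURRENT_REQUESTS = 4
CONCURRENT_REQUESTS_PER_DOMAIN = 2

# Scrapy-side timeout for the whole download, Splash render included.
DOWNLOAD_TIMEOUT = 120

# Retry gateway timeouts a few times instead of failing outright.
RETRY_ENABLED = True
RETRY_TIMES = 3
RETRY_HTTP_CODES = [500, 502, 503, 504, 522, 524, 408, 429]
```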
Hello everyone, the problem was solved using this package:
https://github.com/TeamHG-Memex/aquarium
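For anyone landing here later: Aquarium is a cookiecutter template that generates a Docker Compose setup running several Splash instances behind an HAProxy load balancer, with memory limits and periodic restarts. On the Scrapy side the wiring is the usual scrapy-splash configuration, just pointed at the balanced endpoint. A sketch, assuming the generated setup exposes Splash on port 8050 of the same machine; check the generated docker-compose file for the actual host, port, and credentials:

```python
# settings.py sketch: standard scrapy-splash wiring aimed at the
# Aquarium-managed endpoint. Host and port below are assumptions.
SPLASH_URL = "http://127.0.0.1:8050"

DOWNLOADER_MIDDLEWARES = {
    "scrapy_splash.SplashCookiesMiddleware": 723,
    "scrapy_splash.SplashMiddleware": 725,
    "scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware": 810,
}
SPIDER_MIDDLEWARES = {
    "scrapy_splash.SplashDeduplicateArgsMiddleware": 100,
}
DUPEFILTER_CLASS = "scrapy_splash.SplashAwareDupeFilter"
```

Spreading renders across several Splash processes that are restarted before they bloat is likely what helped on a 2 GB instance, where a single Splash process can run out of memory and stall until requests time out.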
Excellent. Can you share more detail on how this package resolved the issue, so that it will be helpful for everyone? - Vengat
@Shovon, did you check the EC2 outbound policy and make sure it doesn't have any restrictions? Maybe you can try pinging the website from your EC2 instance and see whether you get a response. If you don't, you need to allow that host in the AWS firewall (security group).
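As a quick sanity check along those lines, you can verify outbound connectivity from the instance itself. A small sketch using requests (an HTTP request is closer to what the scraper actually does than a literal ping); the URL is a placeholder:

```python
# connectivity_check.py: run on the EC2 instance (needs: pip install requests)
import requests

TARGET = "https://example.com"  # placeholder for the site being scraped

try:
    resp = requests.get(TARGET, timeout=10)
    print(f"Reached {TARGET}: HTTP {resp.status_code}")
except requests.RequestException as exc:
    # No response at all usually points to DNS, routing, or security group issues.
    print(f"Could not reach {TARGET}: {exc}")
```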