
Friday, September 29, 2017

Bypassing Cloudflare using Python


1. Install cfscrape


    pip install cfscrape




2. Install Node.js on the OS


    yum -y install nodejs

Ref : https://github.com/Anorov/cloudflare-scrape 

Example:

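import json,requests,bs4,cfscrape

# Proxy to use; leave the dict empty to connect without a proxy
proxyDict = {}

scraper = cfscrape.create_scraper()
session_requests = requests.session()

# The URL needs its scheme (http:// or https://)
result=scraper.get("http://blockedsite.com",proxies=proxyDict)

Once the challenge is bypassed, the scraper object, which works like a requests session, can access the site directly. A separate session such as session_requests needs the same cookies and User-Agent to do so.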
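For a plain requests session to reuse the clearance, the Cloudflare cookies and the User-Agent generally have to be carried over. A minimal sketch, assuming cfscrape.get_tokens (described in the cfscrape README) returns the cookies and the User-Agent it used; the helper names apply_clearance and clearance_session are made up for illustration:

```python
def apply_clearance(session, tokens, user_agent):
    """Copy Cloudflare clearance cookies and the matching User-Agent onto a session."""
    session.cookies.update(tokens)
    # The same User-Agent that solved the challenge must be used afterwards
    session.headers["User-Agent"] = user_agent
    return session


def clearance_session(url):
    """Solve the challenge once with cfscrape, then return a plain requests session."""
    # Third-party imports are kept local so the helper above works without them
    import cfscrape
    import requests

    tokens, user_agent = cfscrape.get_tokens(url)
    return apply_clearance(requests.session(), tokens, user_agent)
```

Usage would look like session_requests = clearance_session("http://blockedsite.com"), after which session_requests.get(...) sends the copied cookies.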





Wednesday, September 6, 2017

Python script to get Google Search results











Location :
https://github.com/sterin501/PythonScripts/blob/master/TorrentScrap/googleSearch.py


Requirements


On Windows, torrentSearch.exe can be used instead


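1. json is part of the Python standard library, so no install is needed


2. Install BeautifulSoup4

    pip install BeautifulSoup4

3. Install requests

    pip install requests

4. Install lxml (the script uses the "lxml" parser with BeautifulSoup)

    pip install lxml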




Option A: (Best option)


Scraping through the Google Custom Search API is the best option. It returns the data in JSON format.


1. Get a Google API key
2. Change the custom search settings to search the entire web
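The script below reads the API key and the other settings from config.json. A sketch of such a file, written from Python: the key names are the ones the script reads, but every value is a placeholder, and "&sa=" for googleURLend is only an assumption about where Google's redirect links end. The helper names are made up for illustration.

```python
import json

# Keys match what the script reads; all values are placeholders
EXAMPLE_CONFIG = {
    "googleResultCount": 5,
    "googleKey": "YOUR_API_KEY",
    "gooelcx": "YOUR_SEARCH_ENGINE_ID",
    "googleURLend": "&sa=",  # assumption, adjust to the current result page
    "proxyserver": "http://proxy.example.com:8080",
}

REQUIRED_KEYS = sorted(EXAMPLE_CONFIG)


def missing_keys(config):
    """Return the required keys that are absent from a loaded config."""
    return [key for key in REQUIRED_KEYS if key not in config]


def write_example_config(path="config.json"):
    """Write the placeholder config so it can be edited by hand."""
    with open(path, "w") as handle:
        json.dump(EXAMPLE_CONFIG, handle, indent=2)
```

Calling missing_keys on the loaded file before the first request gives a clearer message than a KeyError in the middle of the script.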






Option B: Scrape from google.com


This is not a good option, because Google changes its output format frequently

#!/usr/bin/env python
import json,requests,bs4


session_requests = requests.session()

# All settings (API key, search engine id, proxy) come from config.json
with open('config.json') as configFile:
  configJson=json.load(configFile)

googleResultCount=configJson['googleResultCount']
googleKey=configJson['googleKey']
gooelcx=configJson['gooelcx']
googleURLend=configJson['googleURLend']

proxyDict = {
  "http" : configJson['proxyserver'] ,
  "https":configJson['proxyserver'],
}


def getGoogleAPI(keyword):
  # Option A: query the Custom Search API and return the result links
  pnr_data = {
    'q'          : keyword,
    'googleHost' : 'google.co.in',
    'num'        : googleResultCount,
    'key'        : googleKey,
    'cx'         : gooelcx
  }
  url="https://www.googleapis.com/customsearch/v1"
  result = session_requests.get(url,params=pnr_data,proxies=proxyDict)
  results = result.json()
  # 'items' is missing when the API returns no results or an error
  data = results.get('items', [])
  URLS=[]
  for kk in data:
    print(kk['link'])
    URLS.append(kk['link'])
  return URLS


def getFrommGoogleCOM(keyword):
  # Option B: scrape the google.co.in result page and collect the result links
  pnr_data = {
    'q'      : keyword,
    'gws_rd' : "cr"
  }
  url="https://www.google.co.in/search"
  result = session_requests.get(url,params=pnr_data,proxies=proxyDict)
  #print(result.content)
  soup = bs4.BeautifulSoup(result.content,"lxml")
  # For offline testing, parse a saved page instead:
  #with open ("result.html", "r") as myfile:
  #   LKD = myfile.read()
  #soup =   bs4.BeautifulSoup(LKD,"lxml")
  hrf=soup.find_all('a', href=True)
  URLS=[]
  for kk in hrf:
    url1 = kk['href']
    # Result links look like /url?q=<target><googleURLend>...
    if url1.startswith("/url?q="):
      url2=url1.split("http")[-1]
      url3=url2.split(googleURLend)[0]
      URLS.append("http"+url3)
  print(URLS)
  return URLS


if __name__ == '__main__':
  URLS=getGoogleAPI("malayalam")
  print(URLS)
  getFrommGoogleCOM("Ajith lv")
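
The href handling in getFrommGoogleCOM depends on splitting strings around googleURLend, which is fragile when the page format changes. As an alternative sketch, not what the script does, the target can be read from the q parameter with the standard library URL parser. Python 3 import paths are assumed, the function name extract_target is made up, and unlike the split approach parse_qs also percent-decodes the value.

```python
from urllib.parse import urlparse, parse_qs


def extract_target(href):
    """Return the target URL of a Google redirect link, or None for other links."""
    if not href.startswith("/url?"):
        return None
    # parse_qs maps each query parameter to a list of its values
    query = parse_qs(urlparse(href).query)
    targets = query.get("q")
    return targets[0] if targets else None
```

Inside the loop this would replace the two split calls: call extract_target(kk['href']) and append the result when it is not None.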









Wednesday, August 23, 2017

Python script to get valid tamilrockers domain and get latest malayalam movie links











Location :


Requirements


On Windows, torrentSearch.exe can be used instead


1. Install lxml (latest release, or pinned to 3.6.0)


    pip install lxml
    pip install lxml==3.6.0

2. Install BeautifulSoup4

    pip install BeautifulSoup4

3. Install requests

    pip install requests

4. Run the script :
    ./torrentSearch.py


Read me


1. It searches for a valid domain based on domain.txt
2. From the Google search results, it checks each and every URL
3. Once a valid torrent site is found, it looks for the Malayalam folder
4. It saves the films to movie.txt
5. A comment # at the start of a line makes the script look for that movie name again on the torrent site
6. Deleting a movie name from movie.txt makes the script look for that movie link again on the torrent site
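
The movie.txt rules in points 4 to 6 can be sketched as below. This is only an illustration of the described behaviour, assuming one movie name per line; the function names are made up and the real script may store the file differently.

```python
def load_known_movies(lines):
    """Collect the movie names the script should treat as already seen."""
    known = set()
    for line in lines:
        name = line.strip()
        # Blank lines and lines commented out with # do not count as seen
        if name and not name.startswith("#"):
            known.add(name)
    return known


def find_new_movies(found, known):
    """Return the names found on the site that are not in movie.txt."""
    return [name for name in found if name.strip() not in known]
```

A commented-out or deleted name is missing from the known set, so it shows up as new again on the next run.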

Example of output :

./torrentSearch.py
From history domain name is lv
Trying with http://tamilrockers.lv
Blocked
Will search in google 
https://play.google.com/store/apps/details?id=tamilrockers.movies&hl=en
http://9to5google.com/2016/12/01/how-to-download-movies-and-shows-in-the-netflix-app-for-android/
http://tamilrockers.nz/
https://www.youtube.com/watch?v=pYWKcRlFnOM
http://playtamil.in/Tamilrockers-movies/
[u'http://tamilrockers.nz']
Trying with http://tamilrockers.nz
GoodURL URL
working domain http://tamilrockers.nz
will do things with http://tamilrockers.nz/index.php/forum/124-malayalam-movies/
^^^^^^^^^^^^^^^^^^^^^^^^^^^
New Movie found Avarude Raavukal (2017) 
Download Link
http://tamilrockers.nz/index.php/topic/59461-avarude-raavukal-2017-malayalam-dvdrip-xvid-mp3-700mb-esubs/       
***************************
New Movies ['Avarude Raavukal (2017) ']



Tuesday, July 18, 2017

Python script to scrape score from cricbuzz with desktop alert











Location :


Requirements




1. Install lxml (latest release, or pinned to 3.6.0)


    pip install lxml
    pip install lxml==3.6.0


2. Install BeautifulSoup4

    pip install BeautifulSoup4

3. Install a package for the desktop alert: pypiwin32 on Windows

    pip install pypiwin32

   or plyer on Linux

    pip install plyer


4. Run the script
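
For the Linux case, a desktop alert through plyer can be sketched as below. It assumes plyer's notification.notify with its title and message arguments; the function name send_alert, the backend parameter and the stdout fallback are made up for illustration and are not taken from the script.

```python
import sys


def send_alert(title, message, backend=None):
    """Show a desktop notification, or print the alert if plyer is not installed."""
    if backend is None:
        try:
            # Third-party: pip install plyer
            from plyer import notification
            backend = notification.notify
        except ImportError:
            def backend(title, message):
                sys.stdout.write("%s: %s\n" % (title, message))
    backend(title=title, message=message)
```

The script would call something like send_alert("Score update", score_text) each time a freshly scraped score differs from the previous one; score_text is a placeholder for whatever string the scraper produces.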

























Alert (gnome)