scrapy note

http://doc.scrapy.org

install pip click here

install scrapy

sudo pip install scrapy
or
sudo easy_install Scrapy

tutorial project

  • set up a new Scrapy project
scrapy startproject tutorial
  • This will create a tutorial directory with the following contents:
tutorial/
    scrapy.cfg
    tutorial/
        __init__.py
        items.py
        pipelines.py
        settings.py
        spiders/
            __init__.py
            ...
  • Next, edit items.py, found in the tutorial directory. Our Item class looks like this:
import scrapy


class DmozItem(scrapy.Item):
    title = scrapy.Field()
    link = scrapy.Field()
    desc = scrapy.Field()

xpath

  1. /html/head/title : selects the <title> element inside the <head> element of an HTML document

  2. /html/head/title/text() : selects the text inside that <title> element

  3. //td : selects all <td> elements

  4. //div[@class="mine"] : selects all <div> elements that have the attribute class="mine"
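
The four expressions above can be tried without a crawl. This is a minimal sketch that uses Python's standard-library xml.etree.ElementTree as a stand-in for Scrapy's selectors. ElementTree supports only a subset of XPath: paths are relative to the root element and there is no text() step, so the text is read from the .text attribute instead. The sample markup is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed sample page (an assumption, not real crawl data).
HTML = """<html>
  <head><title>Example page</title></head>
  <body>
    <div class="mine">first</div>
    <div class="other">second</div>
    <table><tr><td>a</td><td>b</td></tr></table>
  </body>
</html>"""

root = ET.fromstring(HTML)  # root is the <html> element

# /html/head/title : relative to <html> this is head/title
title = root.find("head/title")

# /html/head/title/text() : ElementTree has no text(), so read .text
title_text = title.text

# //td : every <td> anywhere below the root
cells = [td.text for td in root.findall(".//td")]

# //div[@class="mine"] : every <div> whose class attribute equals "mine"
mine = [div.text for div in root.findall(".//div[@class='mine']")]

print(title_text, cells, mine)
```

In a Scrapy spider or shell the same expressions would be passed unchanged to response.xpath(...).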

  • This is the code for our first Spider; save it in a file named dmoz_spider.py under the tutorial/spiders directory:
import scrapy

from tutorial.items import DmozItem


class DmozSpider(scrapy.Spider):
    name = "dmoz"
    allowed_domains = ["dmoz.org"]
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/",
    ]

    def parse(self, response):
        # Each <li> inside a <ul> is treated as one entry.
        for sel in response.xpath('//ul/li'):
            item = DmozItem()
            # extract() returns a list of matched strings.
            item['title'] = sel.xpath('a/text()').extract()
            item['link'] = sel.xpath('a/@href').extract()
            item['desc'] = sel.xpath('text()').extract()
            yield item
  • Storing the scraped data
scrapy crawl dmoz -o items.json
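
The exported file is a JSON array with one object per item, and because the spider stores the result of extract(), each field is a list of strings. A minimal sketch of reading it back follows. The helper names load_items and first_value are hypothetical, and the sample records are invented stand-ins for real crawl output.

```python
import json
import os
import tempfile


def load_items(path):
    """Load the JSON array written by `scrapy crawl dmoz -o items.json`."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def first_value(values):
    """Return the first extracted string, stripped, or '' if nothing matched."""
    return values[0].strip() if values else ""


# Hypothetical sample in the same shape as the export (not real data).
sample = [
    {
        "title": ["Example Book"],
        "link": ["http://example.com/book"],
        "desc": ["\n - An example description.\n"],
    },
    {"title": [], "link": [], "desc": ["\n"]},
]

# Write the sample to a temporary items.json so the sketch runs on its own.
path = os.path.join(tempfile.mkdtemp(), "items.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(sample, f)

# Flatten each list-valued field to a single cleaned string.
rows = [
    {key: first_value(item[key]) for key in ("title", "link", "desc")}
    for item in load_items(path)
]
for row in rows:
    print(row)
```

Entries whose fields all come back empty, such as the second sample record, are usually <li> elements that are not real listings and can be filtered out.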

Trying Selectors in the Shell

scrapy shell "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/"

a1528-4g

—2014.11.05—

Found a problem: iMessage cannot send pictures.
Fix: update Pangu 8.0-8.1.x Untether in Cydia.

—2014.11.01—

link

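  1. Download the Pangu jailbreak for iOS 8. The A1528 device needs iOS 8.1 to be jailbroken. Install Cydia.
  2. Open Cydia, search for the Apple File Conduit "2" 1.2 patch, and install it.
  3. On a computer, use iTools or PP Assistant (PP助手) to open the jailbroken file system at //System/Library/Carrier Bundles/iPhone/CMCC_cn.bundle. Delete all the previously replaced files and copy in all the files from the backed-up CMCC_cn.bundle. The first drag-and-drop reports an error; ignore it and drag the files in again. Then reset the network settings under Settings > General. This restores the original China Mobile carrier files, after which the crack above can be applied. (This step undoes an earlier crack; skip it if the device has never been cracked.)
  4. Add two sources. ChinaSnow source (中国超雪源): apt.chinasnow.net. WeiPhone featured source (威锋精品源): apt.feng.com
  5. After adding them, open the ChinaSnow source and install "超雪LTE激活器" (ChinaSnow LTE Activator). The installation automatically installs the CommCenter patch for IOS7; if it does not, install CommCenter patch for IOS7 manually. Wait for the phone to restart.
  6. Open Cydia again, search for "5S/5C移动4G破解" (5S/5C China Mobile 4G crack), and install it. Note that this patch comes from the WeiPhone featured source. The phone restarts after the patch is installed. After the restart, turn on the LTE switch and wait roughly 10 to 60 seconds for the LTE signal to appear.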

I ran into a problem with sending and receiving SMS; restarting the phone fixed it.

turn off uisearchbar animation

You can cancel animation by subclassing UISearchDisplayController and adding this:

- (void)setActive:(BOOL)visible animated:(BOOL)animated
{
    if (self.active == visible) return;
    [self.searchContentsController.navigationController setNavigationBarHidden:YES animated:NO];
    [super setActive:visible animated:animated];
    [self.searchContentsController.navigationController setNavigationBarHidden:NO animated:NO];
    if (visible) {
        [self.searchBar becomeFirstResponder];
    } else {
        [self.searchBar resignFirstResponder];
    }
}

customize tabbar icon and text color

- (void)configTabBar {
    UIColor *greenColor = [UIColor colorWithRed:0.213
                                          green:0.898
                                           blue:0.573
                                          alpha:1.000];
    [self.tabBar.items
        enumerateObjectsUsingBlock:^(UITabBarItem *obj, NSUInteger idx, BOOL *stop) {
            // Title color: white when unselected, green when selected.
            [obj setTitleTextAttributes:@{
                    NSForegroundColorAttributeName: [UIColor whiteColor],
                }
                               forState:UIControlStateNormal];
            [obj setTitleTextAttributes:@{
                    NSForegroundColorAttributeName: greenColor,
                }
                               forState:UIControlStateSelected];
            // addInt is a custom NSString helper that is not shown in these notes;
            // it is assumed to append the index to the image name.
            // AlwaysOriginal keeps the unselected icon from being tinted.
            obj.image = [[UIImage imageNamed:@"tabbar_icon".addInt(idx)] imageWithRenderingMode:UIImageRenderingModeAlwaysOriginal];
            obj.selectedImage = [UIImage imageNamed:@"tabbar_icon".addInt(idx)];
        }];
}