# htmlparser
**Repository Path**: Mocaris/htmlparser
## Basic Information
- **Project Name**: htmlparser
- **Description**: No description available
- **Primary Language**: Kotlin
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2019-06-03
- **Last Updated**: 2020-12-19
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
## Web data scraping, updated irregularly
### 1. [🇨🇳 Chinese city data scraping (five-level cascade)](src/main/java/com/mocaris/ChineseCityParser.kt)
Mirrored on [Gitee](https://gitee.com/Mocaris/htmlparser)
Mirrored on [GitHub](https://github.com/Mocaris/htmlparser)
Usage reference:
### ChineseCityParser
```kotlin
fun main(args: Array<String>) {
    // Fetch a list of proxy IPs first; the city scrape starts once they are available.
    val proxyParser = ProxyParser()
    proxyParser.listener = object : ParseHtmlListener {
        override fun onSuccess(ipProxys: MutableList) {
            IP_PROXYS = ipProxys
            println("Proxy IPs:\n$ipProxys")
            // Start parsing from the province level.
            val parser = ChineseCityParser()
            parser.parseProvinceHtml()
        }

        override fun onFailed(erMsg: String) {
            // Resubmit the proxy fetch when it fails.
            threadExecutor.execute(proxyParser)
        }
    }
    threadExecutor.execute(proxyParser)
}
```
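
The `onFailed` callback above resubmits the proxy task every time it fails, with no upper limit. A minimal sketch of the same listener-plus-executor pattern with a bounded number of attempts might look like the following. `FetchListener`, `FlakyFetch`, and `runWithRetry` are hypothetical names introduced only for illustration; they are not part of this repository, and the repository's own `ProxyParser` and `ParseHtmlListener` may differ.

```kotlin
import java.util.concurrent.CountDownLatch
import java.util.concurrent.Executors
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical listener mirroring the onSuccess/onFailed shape used above.
interface FetchListener {
    fun onSuccess(proxies: List<String>)
    fun onFailed(message: String)
}

// Hypothetical task: fails the first `failures` runs, then succeeds.
class FlakyFetch(private val failures: Int) : Runnable {
    var listener: FetchListener? = null
    private val calls = AtomicInteger(0)

    override fun run() {
        val call = calls.incrementAndGet()
        if (call <= failures) {
            listener?.onFailed("attempt $call failed")
        } else {
            listener?.onSuccess(listOf("127.0.0.1:8080"))
        }
    }
}

// Runs the task on a single worker thread and resubmits it on failure,
// giving up after maxAttempts runs. Returns null if every attempt failed.
fun runWithRetry(task: FlakyFetch, maxAttempts: Int): List<String>? {
    val executor = Executors.newSingleThreadExecutor()
    val done = CountDownLatch(1)
    var result: List<String>? = null
    var attempts = 1 // only touched on the single worker thread

    task.listener = object : FetchListener {
        override fun onSuccess(proxies: List<String>) {
            result = proxies
            done.countDown()
        }

        override fun onFailed(message: String) {
            if (attempts < maxAttempts) {
                attempts++
                executor.execute(task)
            } else {
                done.countDown()
            }
        }
    }

    executor.execute(task)
    done.await() // the latch makes `result` visible to the calling thread
    executor.shutdown()
    return result
}
```

Calling `runWithRetry(FlakyFetch(2), 5)` would return the stub proxy list on the third run, while `runWithRetry(FlakyFetch(5), 3)` would give up and return `null`.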
The export data format can be modified as needed:
```kotlin
fun writeFile(parent: CityModel?, models: List<CityModel>?) {
    models ?: return
//    threadExecutor.execute {
    // Serialize writes so that concurrent callers do not interleave their output.
    synchronized(this::class.java) {
        val file = File("city_log.txt")
        if (!file.exists()) {
            file.createNewFile()
        }
        // On an empty file, write the table definition and the INSERT header once.
        if (file.length() <= 0) {
            outputStream.write(
                "Create a table named tb_cities; the SQL statements below can be copied, pasted, and run directly\n"
                    .toByteArray(Charsets.UTF_8)
            )
            outputStream.write(
                "CREATE TABLE tb_cities (_id INTEGER PRIMARY KEY AUTOINCREMENT,parent_code TEXT,name TEXT,statistics_code TEXT,classification_code TEXT)\n"
                    .toByteArray(Charsets.UTF_8)
            )
            outputStream.write(
                "INSERT INTO tb_cities (parent_code,name,statistics_code,classification_code) VALUES \n"
                    .toByteArray(Charsets.UTF_8)
            )
            outputStream.flush()
        }
        // Append one VALUES tuple per city; every tuple ends with ",\n".
        val sqlStr = StringBuilder()
        for (city in models) {
            sqlStr.append("  ('${city.parent_code}','${city.name}','${city.statistics_code}','${city.classification_code}')")
            sqlStr.append(",\n")
        }
        val sql = sqlStr.toString()
        outputStream.write(sql.toByteArray(Charsets.UTF_8))
        outputStream.flush()
        println(sql)
    }
//    }
}
```
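
Because `writeFile` ends every tuple with `,\n`, the last tuple in the file keeps a trailing comma that has to be removed by hand before the statement is run, and a name containing a single quote would break the generated SQL. One possible way to adjust the export, sketched under the assumption that all rows are available at once, is shown below. `CityRow`, `sqlQuote`, and `buildInsert` are hypothetical names for illustration; `CityRow` only stands in for the four `CityModel` fields used above.

```kotlin
// Hypothetical stand-in for the repository's CityModel; only the four
// fields written by writeFile are modelled here.
data class CityRow(
    val parentCode: String,
    val name: String,
    val statisticsCode: String,
    val classificationCode: String
)

// Wrap a value in single quotes, doubling any embedded single quote.
fun sqlQuote(value: String): String = "'" + value.replace("'", "''") + "'"

// Build one multi-row INSERT statement terminated by a semicolon.
// Returns an empty string when there is nothing to insert.
fun buildInsert(rows: List<CityRow>): String {
    if (rows.isEmpty()) return ""
    val values = rows.joinToString(",\n") { row ->
        val fields = listOf(row.parentCode, row.name, row.statisticsCode, row.classificationCode)
        "  (" + fields.joinToString(",") { sqlQuote(it) } + ")"
    }
    return "INSERT INTO tb_cities (parent_code,name,statistics_code,classification_code) VALUES\n$values;\n"
}
```

Joining the tuples with `joinToString` places separators only between rows, so the statement can end with a semicolon and be executed without manual cleanup.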