Design and Implementation of an MP3 Downloader: Sample Thesis in Communication Engineering

Published: 2015-01-24 Source: 人大经济论坛
Thesis in Communication Engineering

Contents

1 Introduction
  1.1 Background and purpose of the project
  1.2 Research status and trends at home and abroad
    1.2.1 Search engines
    1.2.2 File downloading
  1.3 Research content and significance
  1.4 Structure of the thesis
2 Technical overview
  2.1 Regular expression matching
  2.2 XML
  2.3 Principles of search engines
  2.4 Threads
    2.4.1 Threads
    2.4.2 Multithreading
  2.5 MP3 tag information
  2.6 The HTTP protocol
  2.7 The PageRank algorithm
  2.8 Chapter summary
3 System design and implementation
  3.1 System flowchart
  3.2 MP3 crawler algorithm
    3.2.1 Breadth-first traversal strategy
    3.2.2 Crawler algorithm improvements for this project
    3.2.3 Parsing HTML
  3.3 MP3 tags
    3.3.1 MP3 tag extraction
    3.3.2 MP3 tag storage
  3.4 File downloading
    3.4.1 Resumable download
    3.4.2 Batch download
    3.4.3 File renaming
    3.4.4 Calculating download speed, progress, and remaining download time
  3.5 The .ini configuration file
  3.6 delegate and event: custom events
  3.7 Chapter summary
4 Analysis of experimental results
  4.1 Web crawler
  4.2 Query
  4.3 File downloading
  4.4 Analysis of results
  4.5 Chapter summary
5 Summary and outlook
  5.1 Summary
  5.2 Outlook
Acknowledgements
References

Abstract

Search engines, the "gateway" to the Internet, are a shortcut for obtaining information resources from the World Wide Web quickly and effectively. The web crawler is a key technology of a search engine: it is a program that automatically extracts, analyzes, and filters web pages, downloading pages from the World Wide Web on the search engine's behalf. File transfer, as the most important function of network applications, is the basis of resource sharing on the Internet, and download tools have become indispensable. Important protocols such as HTTP and FTP support file transfer. In particular, P2P-based download mechanisms with multitasking, multithreading, multiple sources, and resumable transfer have greatly increased download speeds and maximized the sharing of network resources.

This thesis first introduces the main theory and technologies involved in the project. Based on a detailed analysis of the principles of crawler technology and of file download mechanisms, it improves the crawler algorithm for this project's application. An MP3 downloader is then designed and implemented on the basis of the improved crawler algorithm; it consists of two parts, a web crawler and a file downloader. The crawler collects the URLs of MP3 music resources on the Internet along with related information (song title, artist, album, and so on) and saves this information locally in XML format as the basis for later queries and downloads. The downloader implements HTTP-based file download and provides resumable download, multitask download, and automatic file renaming. The MP3 downloader was then tested, and the results show that it achieved the expected results both in crawling MP3 information and in downloading MP3 files.

Finally, the thesis summarizes the work and gives an outlook on future work.

Keywords: search engine, web crawler, HTTP, P2P, resumable download
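
As an illustration of two features the abstract names, resumable download over HTTP and automatic file renaming, the following is a minimal sketch in Python. It is not the thesis's implementation, which is not shown on this page; the function names `range_header` and `unique_name` and the "name (1).ext" naming scheme are assumptions made for this example.

```python
import os


def range_header(partial_path):
    """Build request headers that ask the server to resume a download.

    Hypothetical helper: if part of the file is already on disk, request
    only the remaining bytes with an HTTP Range header.
    """
    offset = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    # "bytes=N-" asks for everything from byte offset N to the end of the file.
    return {"Range": f"bytes={offset}-"} if offset > 0 else {}


def unique_name(path):
    """Return a file name that does not collide with an existing file.

    Hypothetical naming scheme: song.mp3 -> song (1).mp3 -> song (2).mp3 ...
    """
    if not os.path.exists(path):
        return path
    stem, ext = os.path.splitext(path)
    n = 1
    while os.path.exists(f"{stem} ({n}){ext}"):
        n += 1
    return f"{stem} ({n}){ext}"
```

A server that honors the Range request answers with status 206 (Partial Content), and the client appends the body to the partial file. A 200 response means the server sent the whole file, so the client should start over.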