johnny li's blog

A blog built purely for writing

Scraping the original playback URL from QQ Music

johnny python

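Terms used in this post:

songid: appears in the URL of a share link
vkey: a parameter required to obtain the music URL
songmid: the music ID used to request the vkey
filename: built from songmid, as C100{songmid}.m4a
Final music URL: http://dl.stream.qqmusic.qq.com/{filename}?vkey={vkey}&fromtag=66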

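1. Copy the share link

In the music player, click "copy link" to get the original URL https://i.y.qq.com/v8/playsong.html?songid=212877966&source=yqq#wechat_redirect, which gives songid=212877966.

If you instead click a song title on a chart page, for example https://y.qq.com/n/yqq/song/002Fjdnk0zz1T6.html, then the songmid is 002Fjdnk0zz1T6 and you can skip straight to step 4.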
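A minimal sketch of pulling the two identifiers out of the URLs above, using only the standard library. The helper names `extract_songid` and `extract_songmid` are made up for illustration, and the URL shapes are assumed to match the two examples in this step.

```python
import re
from urllib.parse import urlparse, parse_qs


def extract_songid(share_url):
    """Return the songid query parameter of a share link, or None."""
    query = parse_qs(urlparse(share_url).query)
    values = query.get("songid")
    return values[0] if values else None


def extract_songmid(song_page_url):
    """Return the songmid embedded in a song page URL, or None."""
    m = re.search(r"/song/(\w+)\.html", song_page_url)
    return m.group(1) if m else None


share = "https://i.y.qq.com/v8/playsong.html?songid=212877966&source=yqq#wechat_redirect"
page = "https://y.qq.com/n/yqq/song/002Fjdnk0zz1T6.html"
print(extract_songid(share))
print(extract_songmid(page))
```

Both helpers return None rather than raising when the URL does not have the expected shape, so the caller can decide how to handle an unexpected link.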

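2. Query the QQ Music API

Send a JSONP request with the songid (this example uses songid=212891616): https://c.y.qq.com/v8/fcg-bin/fcg_play_single_song.fcg?songid=212891616&tpl=yqq_song_detail&format=jsonp&callback=getOneSongInfoCallback With the irrelevant fields removed, the response looks like this: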

getOneSongInfoCallback({
    "code": 0,
    "data": [{
        ...
        "mid": "0018jCvj3i0f3E",
        ...     
        "url": "http://stream3.qqmusic.qq.com/224891616.wma"
        
    }],
    "url": {
        "212891616": "ws.stream.qqmusic.qq.com/C1000018jCvj3i0f3E.m4a?fromtag=38"
    },
    ...
})
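As a rough sketch, the JSONP wrapper can be stripped with a regular expression before the payload is parsed as JSON. The helper name `strip_jsonp` is hypothetical, and `sample` is a hand-written stand-in that keeps only the fields shown above (the elided `...` parts are left out rather than guessed).

```python
import json
import re


def strip_jsonp(text):
    """Return the parsed JSON payload inside a callback(...) wrapper."""
    m = re.match(r"^\s*\w+\((.*)\)\s*;?\s*$", text, re.S)
    if m is None:
        raise ValueError("not a JSONP response")
    return json.loads(m.group(1))


# Hand-written sample containing only the fields shown above
sample = '''getOneSongInfoCallback({
    "code": 0,
    "data": [{"mid": "0018jCvj3i0f3E"}],
    "url": {"212891616": "ws.stream.qqmusic.qq.com/C1000018jCvj3i0f3E.m4a?fromtag=38"}
})'''

info = strip_jsonp(sample)
songmid = info["data"][0]["mid"]      # top-level mid of the first song
stream_url = info["url"]["212891616"]  # keyed by songid
print(songmid)
print(stream_url)
```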

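3. Get the final URL

Update 04.19: the address obtained in this step is now blocked and returns 403; continue with step 4.

Take the mid and assemble the URL http://thirdparty.gtimg.com/C100{mid}.m4a?fromtag={random} to get the final music file URL. This method no longer works; it reports an unauthorized error.

Instead, read the url value directly from the JSON in step 2: ws.stream.qqmusic.qq.com/C1000018jCvj3i0f3E.m4a?fromtag=38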


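4. Request the vkey for the music file

Inspecting the JavaScript on the playback page yields the following URL: https://c.y.qq.com/base/fcgi-bin/fcg_music_express_mobile3.fcg?g_tk=872989112&jsonpCallback=MusicJsonCallback06459212607938936&hostUin=0&format=json&inCharset=utf8&outCharset=utf-8&notice=0&platform=yqq&needNewCode=0&cid=205361747&callback=MusicJsonCallback06459212607938936&uin=11297258&songmid=0018jCvj3i0f3E&filename=C1000018jCvj3i0f3E.m4a&guid=9136027940

The link is rather long and probably contains unnecessary parameters; I have not yet verified which ones are actually required.

Replace the songmid and filename parameters in it. songmid is the mid from step 2 (note: the response contains several mid fields; use the outermost one), and filename=C100{songmid}.m4a.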
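One way to assemble this request is to keep the parameters in a dict and let `urlencode` build the query string, so only songmid and filename vary. This is a sketch: the function name `build_vkey_url` and the constant `VKEY_ENDPOINT` are illustrative, and every parameter value is simply copied from the captured URL above without knowing which ones the server needs.

```python
from urllib.parse import urlencode

VKEY_ENDPOINT = "https://c.y.qq.com/base/fcgi-bin/fcg_music_express_mobile3.fcg"


def build_vkey_url(songmid, callback="MusicJsonCallback06459212607938936"):
    """Build the vkey request URL for one songmid."""
    params = {
        # Values copied as-is from the captured URL; not verified as required
        "g_tk": "872989112",
        "jsonpCallback": callback,
        "hostUin": "0",
        "format": "json",
        "inCharset": "utf8",
        "outCharset": "utf-8",
        "notice": "0",
        "platform": "yqq",
        "needNewCode": "0",
        "cid": "205361747",
        "callback": callback,
        "uin": "11297258",
        # The two parameters that change per song
        "songmid": songmid,
        "filename": "C100{}.m4a".format(songmid),
        "guid": "9136027940",
    }
    return VKEY_ENDPOINT + "?" + urlencode(params)


vkey_url = build_vkey_url("0018jCvj3i0f3E")
print(vkey_url)
```

Keeping the parameters in a dict also makes it easy to delete entries one at a time when testing which of them are really needed.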

Then send the request; the JSON response contains the vkey:

{
    "code": 0,  
    "data": {
        "expiration": 80400,
        "items": [
            {
                "subcode": 0,
                "songmid": "0018jCvj3i0f3E",
                "filename": "C1000018jCvj3i0f3E.m4a",
                "vkey": "734C6EF27F122751D9D098AC84F2465F8BA2658A4D4EC9C6A084FCDC10C68718A5C72869CF3B424EC4EE8CCF96FEFDD4E2B2F7F9E6790E73"
            }
        ]
    }
}

The expiration field is presumably the vkey's expiry time, although I have not confirmed this.

5. Assemble the final URL

Use the filename and vkey from step 4 to build the following URL: http://dl.stream.qqmusic.qq.com/{filename}?vkey={vkey}&fromtag=66
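The assembly can be sketched as a small function over the step 4 response. The name `build_download_url` is hypothetical, the sample uses the placeholder `EXAMPLE_VKEY` instead of a real key, and treating an empty vkey as a failure is an assumption rather than documented API behavior.

```python
import json

DOWNLOAD_HOST = "http://dl.stream.qqmusic.qq.com"


def build_download_url(vkey_response):
    """Build the final music URL from a parsed step 4 response."""
    item = vkey_response["data"]["items"][0]
    if not item.get("vkey"):
        # Assumption: a missing or empty vkey means the request was rejected
        raise ValueError("response contains no vkey")
    return "{}/{}?vkey={}&fromtag=66".format(
        DOWNLOAD_HOST, item["filename"], item["vkey"]
    )


# Sample shaped like the step 4 response, with a placeholder vkey
response = json.loads('''{
    "code": 0,
    "data": {
        "expiration": 80400,
        "items": [{
            "subcode": 0,
            "songmid": "0018jCvj3i0f3E",
            "filename": "C1000018jCvj3i0f3E.m4a",
            "vkey": "EXAMPLE_VKEY"
        }]
    }
}''')

download_url = build_download_url(response)
print(download_url)
```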

6. Script for batch-downloading QQ Music

#!/usr/bin/python
# -*- coding: UTF-8 -*-

import json
import re

import requests
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC


def download_music(songmid, music_name):
    url = 'https://c.y.qq.com/base/fcgi-bin/fcg_music_express_mobile3.fcg?g_tk=872989112&jsonpCallback=MusicJsonCallback06459212607938936&loginUin=11297258&hostUin=0&format=json&inCharset=utf8&outCharset=utf-8&notice=0&platform=yqq&needNewCode=0&cid=205361747&callback=MusicJsonCallback06459212607938936&uin=11297258&songmid={0}&filename=C100{0}.m4a&guid=9136027940'.format(songmid)
    html = requests.get(url)
    # Strip the JSONP wrapper and parse the JSON payload
    music_json = json.loads(re.findall(r'^\w+\((.*)\)$', html.text)[0])
    filename = music_json['data']['items'][0]['filename']
    vkey = music_json['data']['items'][0]['vkey']
    download_url = 'http://dl.stream.qqmusic.qq.com/{}?vkey={}&fromtag=66'.format(filename, vkey)
    print(download_url)
    # Download the file to local disk
    music = requests.get(download_url)
    # Remove whitespace and characters that are not allowed in Windows file names
    safe_name = re.sub(r'[\s+|@<>:\\"/?*]', '', music_name)
    with open("d:\\music\\{}.m4a".format(safe_name), "wb") as f:
        f.write(music.content)


def view_html():
    # The QQ Music page is rendered by JavaScript, so load it in headless Chrome
    option = webdriver.ChromeOptions()
    option.add_argument('--headless')
    # Raw string so the backslashes in the Windows path are not treated as escapes
    service = Service(r'D:\Program Files\Python\chromedriver.exe')
    driver = webdriver.Chrome(service=service, options=option)

    # Chart page
    driver.get('https://y.qq.com/n/yqq/toplist/26.html')
    print(driver.title)
    try:
        # Wait until the song list has loaded
        WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CLASS_NAME, "songlist__songname_txt")))

        lis = driver.find_elements(By.CLASS_NAME, 'songlist__songname_txt')
        pattern = re.compile(r'https://y\.qq\.com/n/yqq/song/(\S+)\.html')
        for li in lis:
            a = li.find_element(By.CLASS_NAME, 'js_song')
            # The songmid is embedded in the link target
            href = a.get_attribute('href')
            music_name = a.get_attribute('title')
            m = pattern.match(href or '')
            if m:
                download_music(m.group(1), music_name)
    finally:
        driver.quit()


if __name__ == '__main__':
    view_html()

Screenshots of the result: the scraped page, the running script, and the saved music files.
