我有下面的代码, 输出中也需要状态和* policy工作流名称(即星期二), 任何人都可以帮忙。
import json
import requests
import pandas as pd
from pandas.io.json import json_normalize
import ast
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.width', 100)
a = [{'attributes': [{'key': '*policy action jobid', 'values': ['289903']}, {'key': '*policy action name', 'values': ['backup']}, {'key': '*policy name', 'values': ['Daily_Backups']}, {'key': '*policy workflow name', 'values': ['tuesday']}, {'key': 'clone retention policy', 'values': [' 504: 5: 34']}, {'key': 'group', 'values': ['tuesday']}, {'key': 'saveset features', 'values': ['CLIENT_SAVETIME']}], 'browseTime': '2020-05-19T23:57:41+08:00', 'clientHostname': 'xyz.com', 'clientId': '7d391c52-00000004-5cda459d-5c1', 'creationTime': '2020-04-28T21:29:25+08:00', 'fileCount': 0, 'id': '1eb1', 'instances': [], 'level': 'Full', 'links': [{'href': 'https://iservera/backups/1ec1', 'rel': 'item'}], 'name': '/abc', 'retentionTime': '2020-05-19T23:57:41+08:00', 'saveTime': '2020-04-28T21:27:07+08:00', 'shortId': '2177', 'size': {'unit': 'Byte', 'value': 0}, 'type': 'File'}, {'attributes': [{'key': '*policy action jobid', 'values': ['2803']}, {'key': '*policy action name', 'values': ['backup: 1589']}, {'key': '*policy name', 'values': ['Daily_Backups: 159']}, {'key': '*policy workflow name', 'values': ['tuesday: 1588079529']}, {'key': '*ss clone retention', 'values': [' 1588079529: 1588079590: 1824409']}, {'key': 'group', 'values': ['tuesday']}, {'key': 'saveset features', 'values': ['CLIENT_SAVETIME']}], 'browseTime': '2020-05-19T23:57:42+08:00', 'clientHostname': 'abc.com', 'clientId': 'ec3dc1', 'completionTime': '2020-04-28T21:29:47+08:00', 'creationTime': '2020-04-28T21:13:10+08:00', 'fileCount': 0, 'id': 'cc1', 'instances': [{'clone': False, 'id': '1588079529', 'status': 'Aborted', 'volumeIds': ['245614341']}], 'level': 'Full', 'links': [{'href': 'https://abc/backups/c771', 'rel': 'item'}], 'name': '/xyz', 'retentionTime': '2020-05-19T23:57:42+08:00', 'saveTime': '2020-04-28T21:10:53+08:00', 'shortId': '2141727718', 'size': {'unit': 'Byte', 'value': 36264099844}, 'type': 'NDMP'}]
df = json_normalize(a)
a = df[['clientHostname', 'completionTime', 'size.value', 'type', 'fileCount']]
print(a)```
Current output is-
``` clientHostname completionTime size.value type fileCount
0 xyz.com NaN 0 File 0
1 abc.com 2020-04-28T21:29:47+08:00 36264099844 NDMP 0```
Output needed is-
```clientHostname completionTime size.value type fileCount status Policy
0 xyz.com NaN 0 File 0 - tuesday
1 abc.com 2020-04-28T21:29:47+08:00 36264099844 NDMP 0 Aborted tuesday```
#1
你可以获取策略工作流名称为df [‘attributes’] [0] [3] [‘key’], df [‘attributes’] [0] [3] [‘values’]和状态为df [‘instances ‘] [i] [0] [‘status’], 其中i是记录号。
#2
我将使用库jmespath遍历json数据:
要访问密钥, 请使用。要访问列表, 请使用[]表示法。
import jmespath
expression = jmespath.compile("""
[].
{clientHostname:clientHostname, completionTime:completionTime, "size.value":size.value, type:type, fileCount:fileCount, status:instances[].status, Policy:attributes[?key==`*policy workflow name`].values[]}
""")
res = expression.search(a)
res
[{'clientHostname': 'xyz.com', 'completionTime': None, 'size.value': 0, 'type': 'File', 'fileCount': 0, 'status': [], 'Policy': ['tuesday']}, {'clientHostname': 'abc.com', 'completionTime': '2020-04-28T21:29:47+08:00', 'size.value': 36264099844, 'type': 'NDMP', 'fileCount': 0, 'status': ['Aborted'], 'Policy': ['tuesday: 1588079529']}]
进行一些清理以适合你的用例:
#get the day of the week for policy keys
res = [{key:value[0].split(':')[0]
if key=="Policy" else value
for key, value in ent.items()} for ent in res]
#if list is empty, replace it with "-"
res = [{key:"-"
if (isinstance(value, list) and not value) else value
for key, value in ent.items()} for ent in res]
pd.DataFrame(res)
clientHostname completionTime size.value type fileCount status Policy
0 xyz.com None 0 File 0 - tuesday
1 abc.com 2020-04-28T21:29:47+08:00 36264099844 NDMP 0 [Aborted] tuesday
你可能需要查看数据, 看看是否需要进行其他转换
评论前必须登录!
注册