[ad_1]
Companies that allow users to request their personal data are required to comply with the aforementioned GDPR regulation. However, there is a catch: The file format may make the data unreadable for most of the population. In this case we got both html
and json
files. but html
can be read directly json
The files can be more difficult to interpret. I personally believe that the new regulation should also implement a readable data format. But for now…
Let’s explore the files one by one to get the most out of this new feature!
The first file is chat.html
which contains all my chat history with ChatGPT. Conversations are saved under their respective titles. User questions and ChatGPT answers are marked as assistant
and user
Accordingly.
If you’ve ever trained an AI model, this labeling system will be familiar to you.
Let’s look at a sample conversation from my story:
Have you ever seen the thumbs up, up down (👍👎) next to any ChatGPT reply?
This information is treated by ChatGPT as feedback for the given responsewhich will then help you train the chatbot.
This information is stored message_feedback.json
A file containing any feedback you have provided to ChatGPT using the thumbs icons. Information is stored in the following format:
["message_id": <MESSAGE ID>, "conversation_id": <CONVERSATION ID>, "user_id": <USER ID>, "rating": "thumbsDown", "content": "\"tags\": [\"not-helpful\"]"]
The thumbsDown
The rating takes into account incorrectly generated responses, while thumbsUp
Reports generated correctly.
There is also a file (user.json
) contains the following personal data of the user:
false], "phone_number": <USER PONE>
Some platforms are known for creating user models based on platform usage. For example, if Google User search is mainly about programming, Google It is likely that the user is a programmer and uses this information to display personalized ads.
ChatGPT can do the same with information obtained from conversations, but they are currently required to include this inferred information in exported data..
⚠️ FYI, One can access What Google knows about them begining Gmail by clicking account >> Data and Privacy >> Personalized ads >> My advertising center.
There is another file that contains conversation history and also contains metadata. This file is named conversations.json
and Includes information such as creation time, several identifiers, and the model behind ChatGPT, among others.
⚠️ Metadata provides information about the underlying data. It may contain information such as the origin of the data, its meaning, location, ownership and creation. Metadata includes information related to, but not part of, the underlying data.
Let’s explore the same conversation A320 hydraulic system failure is revealed in this first example json
format. The conversation itself consists of the following questions and answers:
From this simple conversation, OpenAI stores quite a bit of information. Let’s take a look at the saved information:
- Main areas
json
The file contains the following information:
Valley moderation_results
empty since then In this particular case, ChatGPT did not provide feedback. Besides, [+]
in the symbol mapping
A field means more information is available.
- in fact
mapping
The field contains all the information about the conversation itself. Since a conversation has four interactions, Routing saves onechildren
Login per interaction.
again, [+]
A symbol indicates that more information is available. Let’s take a look at the different entries!
mapping_id
: contains aid
For conversation, as well as information about the time of creation and the type of content, among others. As far as we can tell, it also creates aparent_id
for conversation etcchildren_id
which corresponds to the user’s next interaction with ChatGPT. Here’s an example:
children_idX
: Newchildren
A record is created for each interaction from a user or assistant. Since conversation has four interactions,json
The file shows fourchildren
records. each onechildren
The record has the following structure:
Პirveli children
The recording is embedded within the conversation mapping_id
As a parent and other interaction – Reply from ChatGP – as a second child.
Children
which corresponds to the ChatGPT response contains additional fields. For example, for the second interaction:
In the case of a ChatGPT response, We get information about the ChatGPT model and stop words. It also shows the first children
as it parent
and third children
As the following interaction.
The full file can be found at this GitHub.
Have you ever used the “regenerate response” button when you’re not entirely sure of the response ChatGPT provided?
This feedback information is also saved!
is the last file named model_comparisons.json
that Contains snippets of conversations and follow-up attempts any time ChatGPT has updated a response. The information contains only text without a title, but contains other metadata. Here is the basic structure of this file:
"id":"<id>",
"user_id":"<user_id>",
"input":[+],
"output":[+],
"metadata":[+],
"create_time": "<time>"
The metadata
The field contains some important information such as the country and continent where the conversation took place and information about it https
Access scheme, among others. Here comes the interesting part of this file input
/output
Records:
input
The input
Contains a collection of messages from the original conversation. Interactions are labeled according to the author And, as in the previous cases, some additional information is also stored. Let’s take a look at the messages saved for our sample conversation:
User
/Assistant
Entries are pending, but I’m sure we’re all wondering at this point Why is A system
label?
and moreover Why do they make such an initial statement at the beginning of every conversation?
Does ChatGPT pre-feed the current date in any new chat?
Yes, These records are so-called system messages.
System messages
System messages provide general instructions to the assistant. They help determine the behavior of the assistant. In the web interface, system messages are transparent to the user, so we cannot see them directly.
The advantage of the system message is that it allows the developer to set up the assistant without the request itself becoming part of the conversation.. System messages can be delivered using the API. For example, if you are building a car sales assistant, one possible system message might be “You are a car sales assistant. Use a friendly tone and ask customers questions until you understand their needs. Then explain the available cars that match their preferences”. You can provide a list of cars, specifications and prices so that the assistant can provide this information as well.
[ad_2]
Source link