Show HN: I fine-tuned GPT-4.1 on my iMessage history

(jonyork.net)

10 points | by jonpizza 5 days ago

6 comments

  • felix089 4 days ago

    How did you structure the dataset for FT? Reminds me of: https://rosslazer.com/posts/fine-tuning/

    • jonpizza 4 days ago

      I chunked my conversations by day so that each example in the dataset would stay on roughly the same topic without random switching. That isn't perfect; ideally I would let an LLM chunk the conversations into logical start/stop points, but I didn't want to spend all that money on tokens. I also dropped any conversations with images, as well as group chats, to simplify.
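
      Roughly the shape of it, if anyone's curious (a simplified sketch; it assumes messages are already pulled out of chat.db as dicts with date/is_from_me/text keys, and those field names are made up here, not my exact pipeline):

        import json
        from itertools import groupby

        # messages: sorted by timestamp, each like
        # {"date": "2024-05-01T09:30:00", "is_from_me": True, "text": "hey"}
        def build_dataset(messages, out_path="train.jsonl"):
            with open(out_path, "w") as f:
                # Chunk by calendar day so each example stays on one topic.
                for day, msgs in groupby(messages, key=lambda m: m["date"][:10]):
                    convo = [{"role": "system",
                              "content": "You are Jon texting a friend."}]
                    for m in msgs:
                        role = "assistant" if m["is_from_me"] else "user"
                        # Merge consecutive texts from the same side so
                        # the roles alternate cleanly.
                        if convo[-1]["role"] == role:
                            convo[-1]["content"] += "\n" + m["text"]
                        else:
                            convo.append({"role": role, "content": m["text"]})
                    # Skip days with no real back-and-forth.
                    if len(convo) > 2:
                        f.write(json.dumps({"messages": convo}) + "\n")

      Each JSONL line is one chat-format training example, which is what the OpenAI fine-tuning endpoint expects.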

  • ungreased0675 4 days ago

    Have you tried talking to yourself with this? Were there any unexpected insights?

    • jonpizza 4 days ago

      I asked "myself" what my greatest fear was and it actually gave me an accurate answer. Then I asked it again and it said "Clowns". I don't think it was particuarly insightful, but it was slightly eeire. The tone and style are kinda spot on tbh, though the content is generally incorrect.

  • bn-l 5 days ago

    I tried it out. Were many of the texts about hooking up and “cuties”?

    • jonpizza 4 days ago

      Not really. There are some, but I do think my conversations generally biased the model towards adjacent topics. I also included something along those lines in the system message ("Hi, I have a cute girl I want you to meet") as a sample response to a basic input like "Hi", just to make the conversation more interesting.
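
      The system message bit looked roughly like this at inference time (paraphrased, not the exact prompt):

        # Illustrative only -- the real wording differs.
        messages = [
            {"role": "system", "content":
                "You are Jon. Reply in his texting style. "
                'Example: if the user says "Hi", reply '
                '"Hi, I have a cute girl I want you to meet"'},
            {"role": "user", "content": "Hi"},
        ]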