Last updated on 2026-02-13

Session 3

Required Self-Study

? ellmer installed and working? Run the chat check (see material)!
ellmer in action:
- create a chat, send a prompt
- inspect the chat object
  - discussion of tokens, cost, response time, etc.
? What is a major problem when using ellmer?
- Advantages of “aggregated calls”
- Problem of token limits for large datasets: the data doesn't fit into one prompt!
- idea of batch processing
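
The batch idea above can be sketched independently of ellmer. This is a minimal illustration in Python, not ellmer's API: `estimate_tokens` and `make_batches` are hypothetical helpers, and the rule of roughly 4 characters per token is only a crude assumption (real tokenizers differ by model).

```python
def estimate_tokens(text: str) -> int:
    """Crude token estimate: assumes ~4 characters per token (not a real tokenizer)."""
    return max(1, len(text) // 4)


def make_batches(rows, max_tokens):
    """Group rows greedily so each batch stays within max_tokens (by the estimate).

    A single row that alone exceeds the budget still gets its own batch.
    """
    batches, current, used = [], [], 0
    for row in rows:
        cost = estimate_tokens(row)
        # Start a new batch when adding this row would exceed the budget.
        if current and used + cost > max_tokens:
            batches.append(current)
            current, used = [], 0
        current.append(row)
        used += cost
    if current:
        batches.append(current)
    return batches


# Ten made-up records of 50 characters each (about 12 estimated tokens per record).
rows = [f"record {i}: " + "x" * 40 for i in range(10)]
batches = make_batches(rows, max_tokens=50)
print([len(b) for b in batches])
```

Each batch would then be sent as one aggregated prompt; the per-prompt budget has to leave room for the instructions and the expected answer.
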
GROUP:
- ! identify at least 4 reasonable use cases for AI-calls via ellmer!
- ? what kind of tasks are not suited for AI-calls or can be done otherwise?
- ? what are drawbacks/cons of AI-usage in data processing?
EXERCISE: pick one use case and set it up as a task
? Thoughts about reproducibility and reliability?
? How to steer the reliability of an LLM?
- discussion of
  - temperature = scaling of the token probabilities (lower = more deterministic)
  - top p = most probable tokens with cumulative probability up to p (lower = more restrictive)
  - top k = the k most probable tokens (lower = more restrictive) (not always supported)
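
To make the three parameters concrete, here is a toy sketch in Python of how they reshape a next-token distribution. It is an illustration under assumptions, not how any particular provider implements sampling: the function `next_token_distribution` and the example logits are made up, and top p is implemented here as "keep the smallest set of top tokens whose cumulative probability reaches p".

```python
import math


def next_token_distribution(logits, temperature=1.0, top_k=None, top_p=None):
    """Return {token: probability} after temperature scaling and top-k / top-p filtering.

    temperature must be > 0 in this sketch.
    """
    # Temperature scaling followed by a numerically stable softmax.
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())
    weights = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(weights.values())
    ranked = sorted(((tok, w / total) for tok, w in weights.items()),
                    key=lambda item: item[1], reverse=True)

    # top k: keep only the k most probable tokens.
    if top_k is not None:
        ranked = ranked[:top_k]

    # top p: keep top tokens until their cumulative probability reaches p.
    if top_p is not None:
        kept, cumulative = [], 0.0
        for tok, prob in ranked:
            kept.append((tok, prob))
            cumulative += prob
            if cumulative >= top_p:
                break
        ranked = kept

    # Renormalise so the remaining probabilities sum to 1.
    norm = sum(prob for _, prob in ranked)
    return {tok: prob / norm for tok, prob in ranked}


logits = {"cat": 2.0, "dog": 1.0, "fish": 0.0, "rock": -1.0}  # made-up scores
base = next_token_distribution(logits)
cold = next_token_distribution(logits, temperature=0.1)
nucleus = next_token_distribution(logits, top_p=0.8)
greedy = next_token_distribution(logits, top_k=1)
print(base, cold, nucleus, greedy, sep="\n")
```

Lower temperature concentrates probability on the top token, while top p and top k cut off the tail; none of them guarantees identical output across runs unless only one token remains.
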
? And what about energy consumption?
? AI vs. human in data processing… whom to trust? when and why?
Best practice discussion/summary

? why care about reproducibility?
discussion of pseudo randomness and seeding in computers
- general problem of randomness in data processing
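
A small sketch of seeding, assuming Python's standard `random` module as a stand-in for any pseudo-random number generator (`draw_sample` is a hypothetical helper): the same seed reproduces the same "random" draw, which is the property a seed parameter is meant to provide.

```python
import random


def draw_sample(values, n, seed=None):
    """Draw n values without replacement using a generator initialised with seed."""
    rng = random.Random(seed)  # independent generator, does not touch global state
    return rng.sample(values, n)


data = list(range(100))
run_a = draw_sample(data, 5, seed=42)
run_b = draw_sample(data, 5, seed=42)  # same seed: same draw
run_c = draw_sample(data, 5)           # no seed: typically differs between runs
print(run_a == run_b)
```

For LLMs, a seed (where the provider supports one) reduces but may not fully remove run-to-run variation, so it is worth checking empirically.
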
GROUP: local LLMs vs. API-based LLMs
- ? what are pros and cons of local vs. API-based LLMs?
- ? how to control reproducibility in both cases?
- ? what seems most appropriate for research?
? How to document LLM usage in data processing?
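
One possible starting point for the documentation question is a machine-readable record stored alongside the results. The sketch below is only a suggestion in Python: the function `usage_record`, all field names, and the example values (provider, model, version) are hypothetical placeholders, not a standard.

```python
import hashlib
import json
from datetime import datetime, timezone


def usage_record(provider, model, prompt, params, package_version):
    """Collect the details needed to describe one LLM call after the fact."""
    return {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "provider": provider,
        "model": model,                      # exact model identifier, not just the family
        "package_version": package_version,  # version of the client library used
        "params": params,                    # e.g. temperature, top p, seed
        # Hash of the prompt so the exact wording can be verified later.
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
    }


record = usage_record(
    provider="example-provider",             # placeholder values throughout
    model="example-model-2026-01",
    prompt="Classify the following survey answers ...",
    params={"temperature": 0, "seed": 42},
    package_version="0.0.0",
)
print(json.dumps(record, indent=2))
```

Storing the full prompt text and the raw responses in addition to the hash makes the processing step auditable even if the model is later retired.
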
Best practice discussion/summary