Secure confidential data with AnythingLLM


Introduction

Nice to have you with us again for another episode of Digital 4 Productivity. I'm often asked: What if I want to use AI, but I'm working with highly confidential information, such as personal data or real company secrets? How can I make sure that this information is not sent back to OpenAI, the provider of ChatGPT, or to Google, the provider of NotebookLM and Gemini, and that it doesn't end up available to my competitors? There have already been one or two cases where information from one company suddenly surfaced in AI models, because the providers like to use such data for training.

Background – Data security on social networks

Yes, there are several points. First, the social media platforms, such as LinkedIn and X (formerly Twitter), are currently all changing their terms and conditions to some extent, and they are doing so very quietly. The default is increasingly that you agree that the data you post publicly on social media may also be used to train AI models. You should always make sure to use the opt-out function: go into Facebook, LinkedIn and X and check where you can opt out if necessary.

And we are happy to include the links in the show notes, which you can of course always find at digital4productivity.de. Or send an e-mail to t.jekel@jekelteam.de and I'll be happy to send you the relevant links. So you should pay a little attention here.

Microsoft 365 – Data security in Copilot

The next issue: if you use Microsoft 365, you have the option of booking Copilot as an extra license. Here you have an assurance from Microsoft that this data will remain in your tenant. Then the question is: do you trust Microsoft? The next option is to work with services that let you run AI tools locally, in extreme cases even without an internet connection. The good news is that many of these large language models are also available as open source, so you can download them free of charge and use them offline. And there are different systems here. In the next episode, I'll bring you a very exciting interview with the team from SchneiderAI, a really exciting start-up in Munich that is working on the question: how can I use a German cloud, or even go as far as providing a fully configured PC, the so-called AI box? That's the idea behind SchneiderAI, and we'll take a look at it in one of the next episodes.

The idea behind AnythingLLM

In this episode I would like to tell you about AnythingLLM. You can find it at anythingllm.com; as always, the link is also in the show notes. The idea is that you can download a piece of software there that has several advantages over comparable solutions. I must have tested 20-30 of these solutions by now. You know, I don't let a non-swimmer explain to me how to swim faster, and in that spirit I always try out all the solutions before I recommend them to you. So you can save yourself all the failed attempts I've made. I'm practically the digital shortcut, the digital truffle pig, as I always say, or the digital travel guide. The point is that you can install this tool for offline use. Many of these tools are hard to install: you have to use terminal commands and work with GitHub repositories. The software developers among you will have a tired smile on their faces.

But even I, who programmed assembler on the C64 as a child and already had a certain understanding of programming (I learned Cobol and Fortran at some point during my studies; we still had to back then), sometimes found it difficult to do all these things at first. I can program, I'm just a bit out of practice, but in principle I'm an advanced user. And then there's always the question: what can I break if I make a mistake in the depths of the terminal commands? The good thing is that AnythingLLM has a simple download routine, just like any other program. You download it for Windows, Linux or macOS. It's also interesting that many of these local AI tools are currently only available for Macs. This is partly because Apple Silicon, from the M1 processor onwards, has a very large AI part in its processor and is architecturally well suited for this, while Windows hardware is only now slowly catching up with the new ARM systems.

ARM hardware from Microsoft and Apple

Windows has just made this ARM strategy change with the new Surface and with competitor devices from other providers such as Dell, Fujitsu, HP or Compaq. Apple did that four years ago; Microsoft is currently going through this process. That's why Apple is a small step ahead for once. The good thing is that you get AnythingLLM for all platforms, so those of you who work with Windows systems can also use this tool.

You don't even need an extreme monster PC for this. You download the software, and in the next step you can build different instances with it. I'll double-check that I'm using the right term here: they're called workspaces. I think that's a pretty cool solution, because you have the huge advantage that you can set up several of them. A workspace is practically like your own ChatGPT chat, or rather like your own custom GPTs, and the great thing is that you can store a different language model and behavior for each workspace.
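As a hypothetical illustration of the workspace idea: AnythingLLM also ships a local developer API, so each workspace can be addressed programmatically. The port, endpoint path and field names below are assumptions for a typical local setup, not a guaranteed interface; check the API documentation built into the app before relying on them.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3001/api/v1"  # assumed default local port
API_KEY = "YOUR-LOCAL-API-KEY"             # generated in AnythingLLM's settings

def build_chat_request(workspace_slug: str, message: str) -> urllib.request.Request:
    """Build (but do not send) a chat request for one workspace.

    Each workspace keeps its own model and settings, so the same
    question can be answered by a different model per workspace.
    """
    payload = json.dumps({"message": message, "mode": "chat"}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/workspace/{workspace_slug}/chat",
        data=payload,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# One offline workspace on a local model, one online workspace: the same
# question goes to whichever slug (and therefore model) you choose.
req = build_chat_request("offline-mistral", "Summarize the attached contract.")
print(req.full_url)
```

The sketch only constructs the request; sending it with `urllib.request.urlopen(req)` would require a running AnythingLLM instance and a valid API key.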

My recommendation – set up your own workspaces with and without the Internet

For example, the minimum I recommend is one workspace without internet access and one with internet access, because the system sometimes hallucinates a bit more when internet sources are involved, and sometimes you only want to work on your own sources. You can then set which model each workspace works with: for example Anthropic's Claude models (via the cloud API), or the current Mistral models from our French colleagues, which are very, very powerful. The great thing about open-source models such as Mistral is that once you have downloaded them, you can use them offline. You have to be a little patient, though. The first time, I was too impatient and thought, there's always an error message, what's the point? Until I went into the model area and realized that the model is somehow 30 GB, and it simply takes a while to download and install. By the way, you don't always have to take the biggest model. Bigger is not always better; sometimes the question is simply how smart a model is.

So give it a try. Via the cloud APIs I can always recommend Anthropic's Claude models; for offline use you can work with the Llama models from Meta (Facebook). These are the most common ones, and you can simply try them out. You can also set a temperature for each workspace. Simplified: at temperature zero, if there are five probable variants, you only ever get the single most probable one; at the highest temperature, which I think goes up to two, you also get all the other possible variants. So temperature zero is more fact-oriented and temperature two is more imaginative. That's a bit short-sighted, though, because those who know me know it's really a question of how likely an answer is: an AI is a probability machine. At a temperature of zero, you only ever get the one most probable answer; at a high temperature, you don't just get the most probable answer but other answers as well. Perhaps that's the background to the temperature.
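The temperature idea can be made concrete with a few lines of code. This is not AnythingLLM's internal implementation, just a minimal sketch of the standard technique: temperature rescales the model's raw scores before they are turned into probabilities for sampling.

```python
import math

def apply_temperature(logits, temperature):
    """Rescale raw model scores (logits) and convert them to probabilities.

    Low temperature sharpens the distribution toward the single most
    likely answer; high temperature flattens it, so less likely answers
    get a real chance of being picked.
    """
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]                # three candidate "answers"
cold = apply_temperature(logits, 0.1)   # near-deterministic: top answer dominates
hot = apply_temperature(logits, 2.0)    # flatter: other answers stay in play
print([round(p, 3) for p in cold])      # roughly [1.0, 0.0, 0.0]
print([round(p, 3) for p in hot])       # all three get meaningful probability
```

This is exactly the "probability machine" point above: temperature does not change what the model knows, only how strictly it sticks to the most probable continuation.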

Test different language models – a methodical approach pays off!

The exciting thing is that you can upload documents per workspace, add websites, and ultimately do almost everything you are used to from the large models. Of course, performance is always somewhat lower, because the large data centers are equipped with the big Nvidia graphics cards that you don't usually have at home, plus arm-thick fiber-optic cables to the internet. This means you will never quite match the performance of a data-center-based solution, but you can keep things in-house, especially when it comes to confidential matters; current computers with good performance and plenty of memory are well suited. With AnythingLLM you can even say: I have one computer here that is available to several people on the network. So from that point of view, I can recommend simply trying it out and going into the settings. You can even disconnect the internet and still work offline with these language models. Because often it's not about the content, it's about the methodology, namely the processing of language: merging content from different documents, rewriting, recombining. That can be done wonderfully.

And if you are doing things for the website that are publicly available anyway, then you can happily work online again. So for me, it's not a replacement but a supplement to these tools. As you know, some people say the cloud is called the cloud because it "klaut" (German for "steals") data. I don't believe that. I think the cloud even offers a higher level of data security in many areas: if I look at how well my servers here are protected against unauthorized access, fire, water and other dangers, then an underground data center from Microsoft or DATEV is much better protected. On the other hand, I also say: passwords, for example, do not belong in the cloud for me. I tend to manage them offline and then synchronize them securely between devices. And it's similar with AI. AnythingLLM is a good addition that you can run on all platforms. Just give it a try, and look forward to one of the next episodes, when I show you Schneider AI and the interview, which, as I said, is very, very exciting.

Use AI strategically

Incidentally, if you have the topic of artificial intelligence on the agenda at your company and you say: "Gosh, I need someone who can take a strategic view of the world of management from their own experience as a managing director of a medium-sized company and as an Executive MBA, but who is also familiar with the depths of the tools and can therefore connect with my IT experts," then I would be very, very happy to help. What I like to offer, and what I am currently doing for many customers, are two-day in-house AI workshops. On the first day we get up to speed on the current state of AI, so that by the end of day one you have understood what AI can already do today. I originally did this in one day and collected use cases at the end of the day. That worked wonderfully, but we never got to the implementation stage, so we always had to add a second day. That's why I now only offer the two-day format: we use the evening for philosophical and ethical discussions, getting to know each other a little better over a beer or, of course, a non-alcoholic drink, and the next day we go into the topic hands-on, so that at the end of the second day you really have the first prototypes you can continue working with.

Conclusion

Because you know that above all, my goal is not to be the hero myself, but to make you the hero, so that you can use AI in your company, sell more, become more productive and even open up new business areas. That is my goal. I sincerely wish you every success and will be happy to support you if required. This was another episode of Digital 4 Productivity, the podcast for productive digitalization. And always remember: first switch on your brain, then your technology.

Yours, Thorsten Jekel.
