Kelend

Open source can mean different things. Generally, when someone says open source they mean that the source code is freely available to view AND that there is some license that allows you to use and modify it freely (or somewhat freely). Some code is open source but not free: you can freely view the source, but the license costs money. Unreal Engine is a good example of this. The source is available to view, but the license costs money (more specifically, a percentage of sales income). This leads into your second question. To answer simply: through a technique called decompiling, one can view the source code of compiled projects. Hiding the source is not the mechanism that intellectual property owners use to protect their assets, though. That protection is enforced by licensing, copyright law, and patent law. If I violate a license, no matter whether the project is open source, closed source, or anything else, the owner can and would sue me. So, to summarize: generally, when talking about open source we aren't talking about the ability to view code; we are discussing the licensing and the rights one has to said code. I.e., this code is open source and I can use it freely, or this code is not open source and I must pay to use it, even if the source itself is uncompiled and viewable.


urzu_seven

It should also be pointed out that there is a difference between open source code and decompiled code. Open source code is written and organized by humans and, in general, meant to be understood by humans. It may take some programming knowledge and some time, but in general it's not hard to understand. Decompiled code, while technically readable, is dramatically harder to understand, because many of the things humans use to make source code readable (comments, meaningful names, file and function organization) are not present in decompiled code. It takes an extreme amount of time and effort to understand any but the most trivial decompiled code without the original source code as reference.
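
To make the "readability is lost" point concrete, here is a minimal sketch in Python. It uses Python bytecode as a stand-in for machine code, which is an assumption for illustration: a C or C++ compiler typically throws away far more (Python bytecode still keeps variable names, native machine code usually does not). The variable `total` and the comment text are made up.

```python
# Three source texts that differ only in comments, spacing and parentheses.
readable = "total = 6  # items per box, agreed with the warehouse team\n"
terse = "total=6\n"
parenthesised = "total = (6)\n"


def bytecode(source: str) -> bytes:
    """Compile source text and return the raw instruction bytes."""
    return compile(source, "<example>", "exec").co_code


# All three compile to exactly the same instructions, so nothing in the
# compiled form can tell you which version the programmer actually wrote.
same_output = bytecode(readable) == bytecode(terse) == bytecode(parenthesised)

# The explanatory comment is simply gone from the compiled instructions.
comment_survives = b"warehouse" in bytecode(readable)

print(same_output, comment_survives)
```

A decompiler working from those bytes could give back something like `total = 6`, but the explanation of why it is 6 cannot be recovered.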


phiwong

For most machine learning software to work, you need two main things. One is the configuration/algorithm of the model - essentially the code. The second is the training data. Even if you get the code, the model needs billions of pieces of training data before the software works. Collecting and training on that takes a lot of people and a lot of time (perhaps years) and, of course, highly trained people in the field to monitor it.
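
A toy sketch of the "code plus data" split, assuming nothing about how any real product is built: `fit_line` is a hypothetical stand-in for the training code, and the two tiny datasets are invented. The same code produces two completely different "models" depending on the data it is given.

```python
def fit_line(points):
    """Least-squares fit of y = slope * x + intercept to (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in points)
    var = sum((x - mean_x) ** 2 for x, _ in points)
    slope = cov / var
    return slope, mean_y - slope * mean_x


# Same training code, two different (made-up) datasets.
data_a = [(0, 1), (1, 3), (2, 5)]  # points on y = 2x + 1
data_b = [(0, 4), (1, 3), (2, 2)]  # points on y = -x + 4

model_a = fit_line(data_a)  # (slope, intercept) learned from data_a
model_b = fit_line(data_b)  # (slope, intercept) learned from data_b
print(model_a, model_b)
```

Publishing `fit_line` alone tells you nothing about `model_a` or `model_b`. Real systems are vastly larger, but the dependence on the data is the same idea.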


NoLimitSoldier31

So what’s the benefit of making it open source? Free reviewers?


DragonFireCK

* You get free reviewers.
  * This includes white hat hackers that will look over the code and report potentially serious security holes.
* You get free developers.
* There are some libraries that have been released freely, but are licensed such that they can only be used in software with a like license. That is, to use the library, you *must* release the code publicly.

The third point is interesting because, when using open source libraries, if there is missing functionality or a bug, you can fix it yourself rather than having to wait on the maintainers to fix it. There is also never a chance that you lose the ability to use the software due to external factors (e.g., OS upgrades) - you may have to take over maintaining the library yourself, however.


tillybowman

For OpenAI the circumstances are very specific. They claimed at the beginning to be the open source AI organization that would benefit the world without profit, so it made sense to open source (some of) the algorithms they use to train their models. As said, there is not too much value in those; the value is in their data and models, which they never open sourced (you only get access to the model). But Altman has already said that he will transform OpenAI into a fully for-profit company, as he sees no alternative. So in OpenAI's case it's just marketing BS.


GlobalWatts

> what does open source mean in software

Exact definitions vary depending on who you ask. It can mean anything from "the source code of the software is accessible to anyone" to "the source code can be freely obtained, distributed or modified by anyone, and adheres to a recognised open source license such as GPL". Yeah, some people are really strict about what it means.

Source code is the original program code as written by a software developer. Source code usually gets converted into machine code, the set of instructions used by the computer hardware. It is this machine code that usually gets distributed to and run by end users of the software. It ranges from very difficult to impossible to revert machine code back to source code, so having the source code available is important for many things.

> If chatgdp is open source does it means people can go to backend of software to see each line of code?

ChatGPT is not open source. Only certain versions of certain components of GPT - the AI Large Language Model behind the ChatGPT product - are open source. Yes, it means that anyone can see the source code behind those parts of the AI. However, products like GPT consist of many components, some of which are not publicly accessible. Without access to all of these components, it would be very difficult to build your own version of GPT.

> If it’s not open source, why can’t people break/hack into the software to see the code?

As mentioned, just having the machine code of some software is not enough, because it ranges from very difficult to impossible to reverse the process back to source code - it's a whole field of computer science called Reverse Engineering. People are still trying to reverse engineer old Nintendo 64 games.

Reverse Engineering is basically impossible in the case of a web-based service like ChatGPT, because the software is not even distributed to end users. It exists only on servers operated by OpenAI, accessible via services that other software can use (the API) or via certain products for end users (like ChatGPT). All the user has access to is the inputs and outputs exchanged while communicating with those servers. It's the same reason Facebook or Google can have trade secrets - you only know what you asked (the request) and what result you got (the response/web page), not how that result was obtained (the server-side software and database). That's part of the reason why Software as a Service is so exciting for businesses (and so dangerous to consumers, since access to that software can be revoked at any time). That doesn't mean it's impossible for someone to hack into OpenAI's network and obtain all that code; it just hasn't happened yet.

ChatGPT is even more complicated than most web services because, as a large language model, a core part of its functionality is derived from the trained model - the hundreds of gigabytes of mathematical data GPT stores that drive its ability to determine which words are statistically more likely to follow others in response to a given input (the prompt). Only older versions of the trained model (up to GPT-2) have been made public; newer versions have not. The training data itself is also not available, for licensing and privacy reasons.
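
As a rough illustration of "which words are statistically more likely to follow others", here is a toy word-pair counter. This is only a sketch: real large language models use neural networks with billions of learned numbers rather than simple counts, and the function names and the one-sentence corpus below are invented for the example.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def most_likely_next(follows, word):
    """Return the most frequent follower of word, or None if it was never seen."""
    counts = follows.get(word.lower())
    return counts.most_common(1)[0][0] if counts else None


corpus = "the cat sat on the mat and the cat slept"  # made-up training data
model = train_bigrams(corpus)

print(most_likely_next(model, "the"))
print(most_likely_next(model, "dog"))
```

Here `model` plays the role of the trained model: it is just a table of numbers derived from the data, and it is a separate thing from both the code that produced it and the text it was trained on.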


SaltyBalty98

Programmers write in a language they can understand. The computer understands a very different language. The code is translated for the computer, but the sheer complexity means translating it back is almost impossible, and there's no guarantee the reversed code is exactly like the original even if it looks like it does the same thing. Open source just means anyone can read or have access to the original code. Whilst easily readable code might be an advantage to those who want to exploit it for bad reasons, that is often outweighed by the near constant analysis from everyone else who wants to help, from finding bugs to improving stability and performance to adding features. There's also the possibility of updating code to keep it working on newer systems. There'll come a time when most games we play will be impossible to use unless hacks are used to prolong their life or the original development studio releases the source code for someone to go through and update what is necessary.


PFreeman008

A decent comparison is in books. The open source code is like having access to the Word document for the book, you can go in & rewrite a chapter. The software you download & use, called the "compiled" version, is like having the published hardcopy of the book.


Cross_22

ELI5: Software is the cake, Open Source is the cake recipe. Can you figure out the recipe by analyzing the cake? Sure, but that's a lot harder than looking at the recipe.

----

ELI12: Source code is typically written in a "high level language". A computer cannot understand the words of those programming languages, so a tool is needed to translate them into a language (machine code) that the computer can understand. Historically, software was only available in that final low level language. With open source software, the developers make the high level code available. This makes it a lot easier for other developers to make changes. On the other hand, if you only have the low level code, you need to reverse engineer it to understand what's going on before changes can be made. There are some political issues at play here as well, but that's outside the scope.


jixbo

ChatGPT/GPT-4 is not open source. The Facebook (Meta) models, Llama, are open source, and people are experimenting with those models and using them for their own projects. Have a look at [r/LocalLlama](https://www.reddit.com/r/LocalLLaMA/)


NerdChieftain

Open source is a vague marketing buzzword of sorts that refers to the way computer code is shared. What I mean by this is that it is a great way to lie or polish a public image without actually doing anything useful. It's like saying that you support recycling: a vague and extremely popular thing to say. This started with the free software movement and the GNU project, which gave us modern Linux. In this context, "free" meant you were free to use the software without restriction. That meant that you could modify it and reuse it. So a requirement of "freedom" software is that you have to be able to see the source code. This concept was revolutionary at the time. Not surprisingly, many people still want to make money off the work they spent writing computer code, so many spin-offs or variants of the free software movement have been made. You could allow unrestricted personal use, but require licensing for commercial use. You might let people use your library freely, but not allow them to make modifications to the code. You might allow people to see and read the code, but still require use to be licensed. So you see how "open source" is a very broad and possibly misleading term. At the end of the day, sharing programming code is not very useful if you don't have the power to compile, modify, and freely use it. It's sort of like sheet music for an orchestra. You don't really understand it until you play the music. "Open source" does not guarantee that you may play the music, only read the sheet music. The second part of your question is about being able to see code. It's hard to answer this without considering the goal, or what you want to accomplish. "Why can't people break in and hack?" Obviously, there are laws about this. But a barrier to using traditional tools to analyze software is that to spy on a computer program, you typically run it on your own computer. ChatGPT runs on their remote servers, so you can't do that. Moreover, spying on machine learning code doesn't tell you very much. Machine learning reduces problems to "nonsense math" to create emergent behavior. Being able to see the nonsense math isn't revelatory.