head	1.3;
access;
symbols;
locks; strict;
comment	@# @;


1.3
date	95.11.12.00.05.52;	author chris;	state Exp;
branches;
next	1.2;

1.2
date	95.10.01.20.53.59;	author chris;	state Exp;
branches;
next	1.1;

1.1
date	95.10.01.18.53.17;	author chris;	state Exp;
branches;
next	;


desc
@README for Newsflash
@


1.3
log
@*** empty log message ***
@
text
@This program is (C) copyright 1995 by Christian Blum.  It is published
under the terms of the GNU General Public License.

Newsflash retrieves news articles via NNTP from one server and
delivers them to another.  It requires at least read-only permissions
on the remote server, and needs to have peer permissions on the local
server.  It works well with INN, but should also work with any other
RFC 977 compliant news software.  Newsflash's highly parallel design
is optimized for throughput, which makes it noticeably faster than
INN's nntpget and the like.


How to compile
==============

Check config.h.  Set BASENAME to whatever directory you'd like to put
newsflash's config files in.  If the receiving server is not
"localhost" or its port is not 119, set LOCALHOST and LOCALPORT
appropriately.

Have a quick look at the Makefile; it should be all right for most
systems.

Now type 'make dep' and 'make'.


How to install
==============

ATTENTION!  The "serverlist" file format has changed since version 0.97!

Make sure you're allowed to create files in the BINPATH directory set
in the Makefile; do 'su' or 'su news' if you have to.  Then type 'make
install'.  Now create the lib directory given in config.h (BASENAME).
Create a file "serverlist" in it.  Have it contain one or more server
entries like this:

server-nickname   server-fqdn   port   grouplist   timestamp

where "server-nickname" is an arbitrary string to be prepended to any
output related to this spool job, "server-fqdn" is the name or IP
address of a news server, "port" is the port it listens on (usually
119 - ignored for now), "grouplist" is the absolute pathname (!) of a
file containing newsgroup patterns (see below), and "timestamp" is the
time this server was last checked (seconds since Jan 1st, 1970,
midnight, GMT), or 0 if it has never been checked before.  Lines
starting with whitespace or '#' are ignored.  Though you can safely
set the timestamp to zero, it is probably not a good idea to do so,
because it will make your first spool from this server very slow and
possibly pretty bulky, depending on the remote server's news spool
size.  Use the included printdate program to get the current time in
seconds since the epoch.  If you subtract a few days (multiples of
86400 seconds), you get a good initial value.  Make sure the serverlist
file is writable when newsflash is started (e.g. set the file owner to
news, permissions to 644, and make newsflash setuid news).

Now you need to create the group file(s) referenced in the
serverlist; it/they must be readable by newsflash, but need not be
writable.  Have them contain lines like

misc.misc
comp.os.linux.*
alt.sex*,!alt.sex.watersports,!alt.sex.fetish.*

and the like.  Note that you _can_ use lists of wildmat expressions,
and you _should_ wherever possible, because this speeds up the NEWNEWS
command on the remote host, but let the first line or two be single
groups; this makes newsflash start the transfer process more quickly.
Whitespace terminates lines; everything after a blank or tab
character is treated as a comment, so you can use INN's active file
directly if you like.  'junk' and 'control' are ignored if you haven't
disabled this feature in the config.h file, but these groups are
processed if they are matched by a wildmat list rather than named on a
line of their own.  See RFC 977 for details on the NEWNEWS command, and
the wildmat man page for wildmat expressions.  See the 'newsfeeds' man
page for further description of newsgroup patterns.

Make sure your local news server carries all the groups you're
interested in, or retrieved articles will end up in junk or /dev/null.
INN: don't edit the active file directly if you don't have to; use
ctlinnd instead (see its man page).


Now go for it!
==============

Fire up your IP connection if you have to, then run 'newsflash -v'.
If it does not seem to do anything useful, you're probably running it
on SunOS (or something similarly bozotic :-).  Some SunOS libraries
appear to have a bug which prevents stdio'ish read requests on sockets
from working; re-compile newsflash with -DSCUMOS added to the OFLAGS
list in the Makefile ('make clean dep all').  Using this option will
probably slow newsflash down a little, so don't use it if you don't
have to, because we want it to be the fastest news
retriever in the west, right? :-)

You should now get some output describing what newsflash is doing.
Note that it is somewhat 'interleaved'; this is a feature, not a bug.
Newsflash uses several processes and several connections in parallel
to speed up the lookup and retrieval process.  While newsflash
retrieves articles for one line in the newsgroup list, it already
probes the remote host for the next one, so that the data connection
will hopefully be running continuously, thus increasing its
throughput.

It is a good idea to use more than one line in the serverlist, even if
they all point to the same server, because they will not be processed
sequentially, but in parallel.  Be sure to use different group list
files with each line though, so that different connections are probing
for different newsgroups (first).  Sorting the lines in another order is
what you should do if you're harvesting articles from several servers;
if all lines refer to the same server, make the group lists disjoint
for best efficiency.


If you set up newsflash with all these hints in mind, it should run
considerably faster than any other non-batching news retriever.  The
drawback is that it also imposes a heavier load on the remote server,
but that's likely to be somebody else's problem. :-)

Once everything is properly set up, you might want to start newsflash
without the -v parameter to make it less wordy.  Try 'newsflash -h'
for a list of other useful parameters.


Known bugs
==========

As I've said, it's still beta... or even alpha, how do you know? :-)

Don't over-interpret newsflash's output in verbose mode; it is still
in the beta state, and the texts are partly misleading.

Newsflash does not contain any locking mechanism for access to the
serverlist.  Since the file is only briefly opened for write access,
this probably doesn't hurt too much for now.

The program's speed is highly dependent on the user's configuration
skills; newsflash does not do much sanity checking.

A man page for newsflash is underway.
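

Example: picking an initial timestamp
=====================================

The serverlist timestamp described under "How to install" is just the
current time in seconds since the epoch minus a few days.  The
following is a minimal sketch of that arithmetic, not the source of
the bundled printdate program.  The names initial_timestamp,
print_initial_timestamp and SECONDS_PER_DAY are made up for this
example, and clamping the result to 0 (the "never checked" value) is
an assumption.

```c
#include <stdio.h>
#include <time.h>

#define SECONDS_PER_DAY 86400L

/* Hypothetical helper (not part of newsflash): step back days_back
   whole days from now.  Results below 0 are clamped to 0, which the
   serverlist format treats as "never checked". */
long initial_timestamp(long now, long days_back)
{
    long start = now - days_back * SECONDS_PER_DAY;
    return start < 0 ? 0 : start;
}

/* Print a serverlist timestamp three days before the current time.
   time() returns a time_t, so cast it before printing with %ld. */
void print_initial_timestamp(void)
{
    printf("%ld\n", initial_timestamp((long)time(NULL), 3));
}
```

Calling print_initial_timestamp() from a small main() prints a value
that can be pasted into the last column of a serverlist entry.  Three
days is only an illustration; pick whatever window suits the remote
server's spool size.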


Chris Blum <chris@@phil.uni-sb.de>
@


1.2
log
@Added printdate target
@
text
@d30 2
d38 1
a38 1
server   port   grouplist   timestamp
d40 16
a55 15
where "server" is the name or IP of a news server, "port" is the port
it listens on (usually 119 - ignored for now), "grouplist" ist the
absolute pathname (!) of a file containing newsgroup patterns (see
below), and "timestamp" is the time this server has been checked last
(seconds since Jan 1st, 1970, midnight, GMT), or 0 if it has never
been checked before.  Lines starting with whitespace or '#' are
ignored.  Though you can safely set the timestamp to zero it is
probably not a good idea to do so, because it will make your first
spool from this server very slow and possibly pretty bulky, depending
on the remote server's news spool size.  Use the included printdate
program to get the current time in seconds since the epoch.  If you
subtract a few days (or multiples of 86400), you get a good initial
value.  Make sure the serverlist file is writable when newsflash is
started (eg. set the file owner to news, permissions to 644, and make
newsflash setuid news).
d140 2
@


1.1
log
@Initial revision
@
text
@d31 4
a34 4
in the makefile; do 'su' if you have to.  Then type 'make install'.
Now create the lib directory given in config.h (BASENAME).  Create a
file "serverlist" in it.  Have it contain one or more server entries
like this:
d39 14
a52 21
it listens on (usually 119), "grouplist" ist the absolute pathname of
a file containing newsgroup patterns (see below), and "timestamp" is
the time this server has been checked last (seconds since Jan 1st,
1970, midnight, GMT), or 0 if it has never been checked.  Lines
starting with whitespace or '#' are ignored.  Though you can safely
set the timestamp to zero, it is probably not a good idea to do so,
because it will make your first spool from this server very slow and
probably pretty bulky, depending on the remote server.  Try a small
program like this to get the current time:

  #include <time.h>
  #include <stdio.h>
  main()
  {
    printf("%d\n",time(NULL));
  }

If you subtract a few days (or multiples of 86400), you get a good
initial value.  Make sure the serverlist file is writable when
newsflash is started (eg. set the file owner to news, permissions to
644, and make newsflash setuid news).
d55 1
a55 1
serverlist; it must be readable for newsflash, but need not be
d60 1
a60 1
alt.sex*,!alt.sex.watersports
d62 12
a73 10
and the like.  Note that you _can_ use wildmat expressions, and you
_should_ wherever possible, because this speeds up the NEWNEWS command
on the remote host, but let the first one or two be single groups;
this lets newsflash start the transfer process earlier.  Whitespace
terminates lines; everything behind a blank or tab character is
treated as a comment, so you can use INN's active file directly if you
like.  'junk' and 'control' are ignored if you haven't disabled this
feature in the config.h file, but these groups are processed if they
match any of the wildmat expressions.  See RFC977 for details on the
NEWNEWS command.
d84 9
a92 8
Fire up your IP connection if you have to, then run newsflash.  If it
does not seem to do anything useful, you're probably running it on
SunOS. :-) Some SunOS's libraries appear to have a bug which prevents
stdio'ish read requests to sockets from working; re-compile newsflash
with -DSCUMOS added to the OFLAGS list in the Makefile.  Using this
parameter will probably slow down newsflash a slight little bit, so
don't use it if you don't have to, because we want it to be the
fastest news retriever in the west, right? :-)
d107 1
a107 1
for different newsgroups first.  Sorting the lines in another order is
d109 2
a110 1
if all lines refer to the same server, make the group lists disjunct.
d118 3
d122 1
d128 2
a129 2
Don't over-interpret newsflash's output; it is still in the beta
state, and the texts are partly misleading.
@
